[QST] How to perform saving and loading of multi-GPU cuml dask RandomForestClassifier models? #3404

Open
Santyk1993 opened this issue Jan 24, 2021 · 4 comments
Labels: 4 - Waiting on Author, inactive-30d, question

Comments

@Santyk1993

The documentation has no clear examples of saving and loading trained 'cuml.dask.ensemble.RandomForestClassifier' models.

I trained the model on 4 GPUs.
After training, I called 'get_combined_model()' on the distributed model to get a single-GPU model for pickling, as described in:
https://docs.rapids.ai/api/cuml/stable/pickling_cuml_models.html#Distributed-Model-Pickling

When I load this pickled model and use it for prediction, I get the following error:
"AttributeError: 'NoneType' object has no attribute 'predict'"

I have tried saving with both the pickle and joblib libraries, using the file extensions .sav, .pkl, and .model.
Every combination produces the same error.
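
For readers following along, the workflow described above can be sketched roughly as below. This is a minimal sketch, not the documented cuML recipe: `save_model`, `load_model`, and `train_and_save` are hypothetical helper names, the file name is arbitrary, and `X_dask`, `y_dask`, and `client` are assumed to be an existing Dask-backed training set and Dask client. Only `get_combined_model()` and the pickle round trip come from the description above.

```python
import pickle


def save_model(model, path):
    """Pickle a single-GPU model object to `path` (hypothetical helper)."""
    with open(path, "wb") as f:
        pickle.dump(model, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_model(path):
    """Unpickle a model and fail early if it has no predict() (hypothetical helper)."""
    with open(path, "rb") as f:
        model = pickle.load(f)
    # This only checks the top-level object. The AttributeError reported in
    # this thread may come from inside the model, so a test prediction on a
    # few rows is still worth doing after loading.
    if model is None or not hasattr(model, "predict"):
        raise ValueError(f"{path} did not unpickle to an object with predict()")
    return model


def train_and_save(X_dask, y_dask, client, path="rf_combined.pkl"):
    """Assumed end-to-end usage; requires cuML, Dask, and GPUs to run."""
    from cuml.dask.ensemble import RandomForestClassifier  # imported lazily

    dist_model = RandomForestClassifier(client=client)
    dist_model.fit(X_dask, y_dask)
    # Collapse the distributed model into a single-GPU model before pickling.
    single_gpu_model = dist_model.get_combined_model()
    save_model(single_gpu_model, path)
    return path
```

The file extension should not matter to pickle, which is consistent with .sav, .pkl, and .model all behaving the same way.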

Can someone advise on this issue?

@Santyk1993 added the "? - Needs Triage" and "question" labels Jan 24, 2021
@dantegd
Member

dantegd commented Jan 24, 2021

Hi @Santyk1993, I believe you ran into a bug in version 0.17, described in this issue:

#3331

This PR should fix it for upcoming versions once it is merged: #3388
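
To check whether an installed cuML is the release named above, something like the following could be used. It is only a sketch: `is_reported_affected` is a hypothetical helper, and it flags 0.17.x alone because that is the only version mentioned in this thread. Whether other releases are affected, and which release carries the fix, is not stated here.

```python
def is_reported_affected(version):
    """Return True if `version` looks like a 0.17.x release string.

    Hypothetical helper: only 0.17 is named in this thread as affected.
    """
    parts = version.split(".")
    return len(parts) >= 2 and parts[0] == "0" and parts[1] == "17"
```

With cuML installed, it would be called as `is_reported_affected(cuml.__version__)` after `import cuml`.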

Thanks!

@drobison00 added the "4 - Waiting on Author" label and removed the "? - Needs Triage" label Feb 18, 2021
@drobison00
Contributor

@Santyk1993 did this resolve the issues you were seeing?

@Santyk1993
Author

@drobison00 I read in #3388 that pickling the 'cuml.dask' version of the Random Forest classifier is now fixed. That's great to hear.

At the time, I worked around the issue by not saving the model and using it in the same run instead.
I am not currently running the Dask version of cuML RF, but I will update this thread as soon as I try it.

Thank you.

@github-actions

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.
