Got ray.serialization.DeserializationError when trying to run KernelShap in distributed mode #1

Open
Chen2908 opened this issue Feb 24, 2021 · 5 comments

Comments

@Chen2908

Hello,
I read your documentation and blog post and was interested in running KernelShap in distributed mode.
I've installed ray and alibi[ray] as in the documentation. Other packages were already installed.
Running the following code on a binary dataset (1s and 0s as feature values) raised a serialization issue:

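import time
from alibi.explainers import KernelShap

opts = {'n_cpus': 19}  # number of CPU cores used for the distributed run
start = time.time()
# called from inside a class, hence self.func_predict
distrib_explainer = KernelShap(self.func_predict, distributed_opts=opts)
distrib_explainer.fit(background_set)
distrib_shap_values = distrib_explainer.explain(record_to_explain, nsamples='auto')
print(time.time() - start)  # elapsed time in seconds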

This is the error:
raise DeserializationError()
ray.serialization.DeserializationError

Do you know what the problem is and how it can be resolved? Thanks.

@jklaise
Collaborator

jklaise commented Feb 24, 2021

@Chen2908 was this an error using KernelShap from alibi? I would suggest opening an issue on alibi as we can better track it there.

@alexcoca
Owner

alexcoca commented Feb 24, 2021

Hi @Chen2908

Thank you for raising this. For the distributed computation to work, your model has to be serialised in the main process so that it can be passed to other processes spawned on different cores. I suspect that ray cannot deserialise your model for some reason.

This has to do with the code for your model, so the key to solving it is to understand which part of your code leads to the error. I also see that you are trying to call the distributed shap from inside a class. You might want to follow the examples more closely: instantiate and fit your model first, and then pass predict_fcn = model.func_predict to KernelShap.
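As a minimal sketch of that ordering (not taken from the alibi examples): `ToyModel`, its `threshold` argument and the toy data are hypothetical stand-ins for the reporter's classifier, and the `KernelShap` call is left as a comment because it needs `alibi[ray]` installed.

```python
class ToyModel:
    """Hypothetical stand-in for a custom 0/1 classifier."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.fitted = False

    def fit(self, X, y):
        # A real model would learn from X and y here
        self.fitted = True
        return self

    def func_predict(self, X):
        # Predict 1 when the mean feature value of a row exceeds the threshold
        return [int(sum(row) / len(row) > self.threshold) for row in X]


# 1. Instantiate and fit the model at module level, outside any wrapper class
model = ToyModel().fit([[0, 1], [1, 1]], [0, 1])

# 2. Take the bound prediction method as a standalone callable
predict_fcn = model.func_predict

# 3. Pass that callable to the explainer (commented out: requires alibi[ray])
# distrib_explainer = KernelShap(predict_fcn, distributed_opts={'n_cpus': 19})
```

A real predictor would normally return a NumPy array rather than a list; the list here only keeps the sketch free of dependencies.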

It is also useful to check which serialisation protocol the current version of ray uses, and to verify that your model survives a serialisation/deserialisation round trip.
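A rough way to run such a round-trip check with the standard library is sketched below. `check_roundtrip` is a hypothetical helper, not part of ray or alibi. Ray serialises with cloudpickle, which accepts more objects than plain `pickle`, so a failure here is a hint about where to look rather than proof that ray would fail on the same object.

```python
import pickle


def check_roundtrip(obj):
    """Return None if obj survives pickle dumps/loads, else an error message."""
    try:
        payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
        pickle.loads(payload)
    except Exception as exc:  # pickling failures surface as several error types
        return f"{type(exc).__name__}: {exc}"
    return None


# Plain containers round-trip without trouble
print(check_roundtrip({"weights": [0, 1, 1]}))

# A lambda cannot be pickled by the standard library
print(check_roundtrip(lambda x: x))
```

Calling the helper on the model object and then on each of its attributes in turn can narrow down which member is responsible.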

@alexcoca
Owner

@Chen2908 It seems ray uses a customised version of pickle protocol 5, which is available natively from Python 3.8. Have a look at how you can call it from their library. They also have some docs about this and a helpful function for picking up serialisation issues. Hope this helps.

@Chen2908
Author

Thank you for your reply.
First I would like to point out that running the original Kernel Explainer with the same inputs works fine.
I wanted to give the distributed run a try, so I followed your example as closely as I could given the differences in my model.
My model is a custom model derived from scikit-learn's most basic model class (ClassifierMixin). I implemented the fit and predict functions myself; predict returns 0/1.
The 'self.func_predict' you see there calls the model.predict function with another required parameter in addition to the record.
I did a little digging in the code, and it turns out the error I got before was raised while computing self.expected_value = self._explainer.expected_value (shap_wrappers.py, line 760). It was actually caused by an incompatible version of the msgpack package, which needs to be 0.6.0.
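For anyone hitting the same error, a small sketch for checking which msgpack version is installed. `installed_version` is a hypothetical helper; `importlib.metadata` needs Python 3.8 or later, and on Python 3.7 the `importlib_metadata` backport offers the same API.

```python
from importlib import metadata  # standard library from Python 3.8


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Compare the result against the version the environment is expected to have
print("msgpack:", installed_version("msgpack"))
```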

Currently I'm experiencing a different issue when calling the explain function:
File "/home/.conda/envs/env/lib/python3.7/site-packages/shap/utils/_legacy.py", line 87, in
instance.group_display_values = [instance.x[0, group[0]] if len(group) == 1 else "" for group in data.groups]
IndexError: index 1 is out of bounds for axis 1 with size 1

I tried to debug it and I honestly don't know why it is happening...
Any thoughts?

@alexcoca
Owner

alexcoca commented Mar 1, 2021

Hi @Chen2908,

This might be an issue with the dimensions of the arrays that you feed in. What is the shape of your background data? What about the input to the model?
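A quick way to answer these questions is to print the shapes and make sure the record is two-dimensional with the same number of columns as the background data. The sketch below reuses the variable names from the original snippet, but the array sizes are made up for illustration, and a shape mismatch is only one possible cause of the IndexError.

```python
import numpy as np

# Hypothetical data: 100 background rows and one record, 5 binary features each
background_set = np.random.randint(0, 2, size=(100, 5))
record_to_explain = np.random.randint(0, 2, size=5)  # 1-D array, shape (5,)

print(background_set.shape, record_to_explain.shape)

# Make the record 2-D so its column count can be compared with the background
record_2d = np.atleast_2d(record_to_explain)  # shape (1, 5)
assert record_2d.shape[1] == background_set.shape[1]
```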


3 participants