I am generating a SHAP summary plot like so:

import matplotlib.pyplot as plt
import shap

explainer = shap.KernelExplainer(model, X_test[:100,:])
shap_values = explainer.shap_values(X_test[:100,:])
# show=False keeps the figure open so it can be saved with plt.savefig
shap.summary_plot(shap_values, features=X_test[:100,:], feature_names=feature_names, show=False)
plt.savefig('test.png')

This works okay and creates a plot that looks like:

[Image: my summary plot]

This looks okay, but there are a couple of problems. From reading up on SHAP summary plots, I frequently see ones that look like this:

[Image: example summary plot, how I want mine to look]

As you can see, this looks a bit different from mine. Based on the text at the bottom of both plots, mine appears to show the average SHAP value for each feature, whereas the ones I see online show each individual data point for each feature. In other words, the ones I see online are more granular.

How can I create a summary plot that shows each data point rather than the average impact of each feature? I figured there must be a boolean parameter to summary_plot(), like use_average or something, but I can't find anything.

Also, as you can see in my summary plot, only 20 features are included on the y axis. My model actually has around 100 features, and I would like to include all of them in the plot if possible. I figured SHAP defaults to showing 20, but I am hoping there is a way to increase this number.

1 Answer

My understanding is that shap.summary_plot falls back to a bar plot when the model has more than one output, or when SHAP believes it has more than one output (which was true in my case). When I tried forcing a dot plot with the plot_type option of summary_plot, it raised an assertion error explaining this problem.

You can try replicating that error message with:

shap.summary_plot(shap_values, x_train, plot_type='dot', show=False)

If you get the same error, then try this for the first output variable in your model:

shap.summary_plot(shap_values[0], x_train, show=False)

That seems to have solved my problem.
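
Here is a minimal sketch of that idea, in case it helps. It assumes that a multi-output explainer returns either a list with one (n_samples, n_features) array per output, or (in newer shap releases, as far as I know) a single array of shape (n_samples, n_features, n_outputs). A classifier explained through predict_proba would typically count as multi-output, even for a binary target. The helper names select_output and plot_dot_summary and the file name are hypothetical, not part of shap.

```python
def select_output(shap_values, output_index=0):
    """Return the SHAP values for a single model output.

    Assumption: multi-output results are either a list of 2-D arrays
    (one per output) or one 3-D array whose last axis is the output.
    """
    if isinstance(shap_values, list):
        return shap_values[output_index]
    if getattr(shap_values, "ndim", 2) == 3:
        return shap_values[:, :, output_index]
    # Anything else is treated as already single-output.
    return shap_values


def plot_dot_summary(shap_values, features, feature_names=None,
                     output_index=0, path="summary_dot.png"):
    """Draw a dot-style summary plot for one output and save it to path."""
    # Imported here so select_output can be used without shap installed.
    import matplotlib.pyplot as plt
    import shap

    shap.summary_plot(
        select_output(shap_values, output_index),
        features,
        feature_names=feature_names,
        plot_type="dot",
        show=False,
    )
    plt.savefig(path, bbox_inches="tight")
    plt.close()
```

With the variables from the question, this could be called as plot_dot_summary(shap_values, X_test[:100,:], feature_names). Which output_index corresponds to which class depends on your model, so check that before interpreting the plot.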

As for increasing the number of features shown, I believe the max_display option should help, although I haven't tried it past 20 (my model is not that big). For example, this limits the plot to 5 features:

shap.summary_plot(shap_values[0], x_train, max_display=5, show=False)
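
To go the other way and show every feature, something like the sketch below might work. I have not tested it at 100 features. It assumes your shap version accepts both max_display and plot_size in summary_plot. The helper name all_features_kwargs and the default row height and width (in inches) are my own arbitrary choices.

```python
def all_features_kwargs(n_features, row_height=0.3, width=10.0):
    """Build keyword arguments for shap.summary_plot that show every feature.

    Assumption: max_display and plot_size are supported by the installed
    shap version. The figure height grows with the number of features so
    that the rows do not overlap.
    """
    height = max(4.0, row_height * n_features)
    return {
        "max_display": n_features,
        "plot_size": (width, height),
        "show": False,
    }


# Hypothetical usage with the variables from this answer:
#   kwargs = all_features_kwargs(x_train.shape[1])
#   shap.summary_plot(shap_values[0], x_train, **kwargs)
#   plt.savefig("all_features.png", bbox_inches="tight")
```

Passing bbox_inches="tight" to savefig should keep long feature names from being cut off in the saved image.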

I hope it helps. Good luck :)

1 Comment

What do you mean by 'the model has more than one output'? I have a binary target; does that count as one or two outputs? When I try to do the 'dot' plot, it complains. It's a bit weird.
