Using SHAP to explain DNN model but my summary_plot is only showing the average impact of each feature and doesn't include all features

Question

So I am generating a shap summary plot like so:

import matplotlib.pyplot as plt
import shap

explainer = shap.KernelExplainer(model, X_test[:100,:])
shap_values = explainer.shap_values(X_test[:100,:])
fig = shap.summary_plot(shap_values, features=X_test[:100,:], feature_names=feature_names, show=False)
plt.savefig('test.png')

This works okay and creates a plot that looks like:

This looks okay, but there are a couple of problems. When reading about shap summary plots, I frequently see ones that look like this:

As you can see, this looks a bit different from mine. Based on the text at the bottom of both plots, mine shows the average SHAP value for each feature, whereas the ones I see online show each individual data point for each feature. In other words, the ones I see online are more granular.

How can I create a summary_plot that shows each data point instead of the average impact of each feature? I figured there must be a boolean parameter to summary_plot(), something like use_average, but I can't find anything.

Also, as you can see on my summary_plot, only 20 features are included on the y-axis. My model actually has around 100 features, and I would like to include all of them in the summary_plot if possible. I figured shap defaults to showing 20, but I am hoping there is a way to increase this number.

Rajesh · Accepted Answer · 2021-05-20 16:09:21Z

3

My understanding is that shap.summary_plot draws only a bar plot when the model has more than one output, or when SHAP believes it has more than one output (which was true in my case). In that situation shap_values is a list with one array per output rather than a single array. When I tried forcing a "dot" plot with the plot_type option of summary_plot, it raised an assertion error explaining this problem.

You can try replicating that error message with:

shap.summary_plot(shap_values, x_train, plot_type='dot', show = False)

If you get the same error, then pass only the SHAP values for the first output of your model:

shap.summary_plot(shap_values[0], x_train, show = False)

That seems to have solved my problem.
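The indexing step above can be wrapped in a small helper. This is only a sketch under assumptions: the names select_output and plot_dots are hypothetical, and the layouts handled (a list with one array per output, or a 3-D array with outputs on the last axis, which some newer SHAP releases appear to return) should be checked against your installed version. A network whose predict returns shape (n, 1) may also be treated as multi-output, which would explain the bar-only plot for a single-target model.

```python
import numpy as np


def select_output(shap_values, output_index=0):
    """Return a 2-D (n_samples, n_features) array for one model output.

    Layouts handled (assumed here; check your SHAP version):
      * a list of 2-D arrays, one per output
      * a 3-D array shaped (n_samples, n_features, n_outputs)
      * a plain 2-D array (single output), returned unchanged
    """
    if isinstance(shap_values, list):
        return np.asarray(shap_values[output_index])
    values = np.asarray(shap_values)
    if values.ndim == 3:
        return values[:, :, output_index]
    return values


def plot_dots(shap_values, features, feature_names, path, output_index=0):
    """Save a per-sample ("dot") summary plot for one output to `path`."""
    # Imported lazily so select_output works without the plotting libraries.
    import matplotlib.pyplot as plt
    import shap

    values = select_output(shap_values, output_index)
    shap.summary_plot(values, features=features, feature_names=feature_names,
                      plot_type="dot", show=False)
    plt.savefig(path, bbox_inches="tight")
    plt.close()
```

With the variables from the question, a call might look like plot_dots(shap_values, X_test[:100, :], feature_names, 'test.png').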

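As for increasing the number of features shown, I believe the max_display option should help, although I haven't tried it past 20 (my model is not that big). The call below uses 5; to show more features you would pass a larger value, such as your feature count: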

shap.summary_plot(shap_values[0], x_train, max_display = 5, show = False)
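For a model with around 100 features, the same option could be set from the feature count, with the figure made taller so the rows stay readable. The sketch below is not tested against a large model: summary_plot_options is a hypothetical helper, the row height and width are arbitrary choices, and it assumes your SHAP version accepts a (width, height) tuple for plot_size.

```python
def summary_plot_options(n_features, row_height=0.4):
    """Build keyword arguments that ask summary_plot to list every feature.

    row_height is inches per feature row (an arbitrary readability choice,
    not a SHAP constant); the width of 8 inches is likewise an assumption.
    """
    return {
        "max_display": n_features,
        "plot_size": (8, n_features * row_height + 1.5),
        "show": False,
    }


# Hypothetical usage with the variables from the question:
# opts = summary_plot_options(X_test.shape[1])
# shap.summary_plot(shap_values[0], X_test[:100, :],
#                   feature_names=feature_names, **opts)
```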

I hope it helps. Good luck :)

edited May 20, 2021 at 16:09

answered May 20, 2021 at 14:53


1 Comment

Soerendip Over a year ago

What do you mean by 'the model has more than one output'? I have a binary target; does that count as one or two outputs? When I try to do the 'dot' plot, it complains. It's a bit weird.
