6

I am trying to compute PDF estimate from KDE computed using scikit-learn module. I have seen 2 variants of scoring and I am trying both: Statement A and B below.

Statement A results in following error:

AttributeError: 'KernelDensity' object has no attribute 'tree_'

Statement B results in following error:

ValueError: query data dimension must match training data dimension

Seems like a silly error, but I cannot figure out. Please help. Code is below...

from sklearn.neighbors import KernelDensity
import numpy

# d is my 1-D array data
xgrid = numpy.linspace(d.min(), d.max(), 1000)

density = KernelDensity(kernel='gaussian', bandwidth=0.08804).fit(d)

# statement A
density_score = KernelDensity(kernel='gaussian', bandwidth=0.08804).score_samples(xgrid)

# statement B
density_score = density.score_samples(xgrid)

density_score = numpy.exp(density_score)

If it helps, I am using 0.15.2 version of scikit-learn. I've tried this successfully with scipy.stats.gaussian_kde so there is no problem with data.

2 Answers 2

10

With statement B, I had the same issue with this error:

 ValueError: query data dimension must match training data dimension

The issue here is that you have 1-D array data, but when you feed it to fit() function, it makes an assumption that you have only 1 data point with many dimensions! So for example, if your training data size is 100000 points, the your d is 100000x1, but fit takes them as 1x100000!!

So, you should reshape your d before fitting: d.reshape(-1,1) and same for xgrid.shape(-1,1)

density = KernelDensity(kernel='gaussian', bandwidth=0.08804).fit(d.reshape(-1,1))
density_score = density.score_samples(xgrid.reshape(-1,1))

Note: The issue with statement A, is that you are using score_samples on an object which is not fit yet!

Sign up to request clarification or add additional context in comments.

Comments

1

You need to call the fit() function before you can sample from the distribution.

1 Comment

I am calling. See line density = KernelDensity(kernel='gaussian', bandwidth=0.08804).fit(d)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.