I'm trying to understand why the points in the output image have no Y value when I plot the clusters. Every point is at (x, 0). The data used in this example is a NumPy array of shape (125, 532).

[[ 0.85269496  0.         -0.42126083 ... -0.09019524 -0.09706005
-0.09370346]
...
[-1.01090257  0.          0.64767467 ... -0.09020601 -0.10006334
-0.09273296]]

I'm still getting meaningful clusters, so it doesn't seem to affect the analysis, but I'm curious why the output looks this way (assuming it's not just a programming mistake).

Here is the code for plotting the points. It's taken almost verbatim from the DBSCAN example on the scikit-learn site.

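import numpy
import matplotlib.pyplot as plt
from sklearn.cluster import MeanShift
from sklearn.preprocessing import StandardScaler

# X is the (125, 532) array described above
ms = MeanShift()
X1 = StandardScaler().fit_transform(X)
ms.fit(X1)
labels = ms.labels_
unique = numpy.unique(labels)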

plt.figure()
colors = [plt.cm.Spectral(each)
            for each in numpy.linspace(0, 1, len(unique))]
for k, col in zip(unique, colors):
    if k == -1:
        col = [0, 0, 0, 1]

    class_member_mask = (labels == k)

    xy = X1[class_member_mask]
    plt.plot(xy[:, 0], xy[:, 1], 'o', markerfacecolor=tuple(col), markeredgecolor='k', markersize=14)

    xy = X1[class_member_mask]
    plt.plot(xy[:, 0], xy[:, 1], 'o', markerfacecolor=tuple(col), markeredgecolor='k', markersize=6)

plt.title("Clusters")
plt.savefig(plotfn)
plt.close()

And here is the output image.

[image: cluster plot]

1 Answer

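Presumably your second attribute is constant 0. The plot draws only the first two columns, xy[:, 0] against xy[:, 1], so a constant second column puts every point at y = 0. StandardScaler does not change this: a constant column is still constant (all zeros) after scaling.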

In the two example rows provided it certainly is 0.

What is X1[:, 1].max()? If it is 0 (and X1[:, 1].min() is 0 as well), the column carries no information and you can drop it entirely.
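As a rough sketch of that check, here is one way to confirm the column is constant and drop it. The small random array is only a hypothetical stand-in for your X1, and the names constant_cols and X_reduced are made up for illustration.

```python
import numpy as np

# Hypothetical stand-in for the scaled data: 5 samples, 4 features,
# with the second feature (index 1) forced to a constant 0.
rng = np.random.default_rng(0)
X1 = rng.normal(size=(5, 4))
X1[:, 1] = 0.0

# The plot uses columns 0 and 1, so look at the range of column 1.
col_min, col_max = X1[:, 1].min(), X1[:, 1].max()
print(col_min, col_max)

# Indices of every column whose minimum equals its maximum (no variation).
constant_cols = np.flatnonzero(X1.max(axis=0) == X1.min(axis=0))

# Remove those columns before clustering or plotting.
X_reduced = np.delete(X1, constant_cols, axis=1)
print(constant_cols, X_reduced.shape)
```

After dropping the constant column, xy[:, 1] in the plotting loop refers to a feature that actually varies, so the points should no longer sit on a single horizontal line. Keep in mind that this is still only a view of two of the remaining features.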
