import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
cs = []
for i in range(1, 11):
    kmeans = KMeans(n_clusters = i, init = 'k-means++', max_iter = 300, n_init = 10, random_state = 0)
    kmeans.fit(X)
    cs.append(kmeans.inertia_)
plt.plot(range(1, 11), cs)
plt.title('The Elbow Method')
plt.xlabel('Number of clusters')
plt.ylabel('CS')
plt.show()

When I run this code, I get the following error:

AttributeError: 'NoneType' object has no attribute 'split'

  • It would be most helpful if you could indicate which line of the code causes the error. One issue that I spot right away is that you can't use range(1,11) as argument to plot. It should be list(range(1,11)). Commented Nov 29, 2024 at 6:51
  • Also, what exactly is in the variable X? Commented Nov 29, 2024 at 6:52
  • This code has no problem and it works. The error most likely arises because split is called internally while KMeans processes the data: if X is None or has invalid contents, KMeans may attempt operations on a NoneType value, resulting in this error. Fixing the input data (X) should resolve it. If you encounter further issues, share the full code or dataset for more meaningful debugging. Your problem is very elementary. Commented Nov 29, 2024 at 7:29
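The last comment suggests checking the input X before fitting. A minimal sketch of such a check follows; the helper name `find_input_problems` and the specific checks are illustrative assumptions, not part of scikit-learn, and passing them does not guarantee that the error above goes away.

```python
import numpy as np


def find_input_problems(X):
    """Return a list of human-readable problems found in X (hypothetical helper)."""
    if X is None:
        return ["X is None"]
    try:
        arr = np.asarray(X)
    except ValueError as exc:
        # Ragged nested lists cannot be converted to a regular array.
        return [f"X cannot be converted to an array: {exc}"]

    # Strings, None entries, and mixed types end up as non-numeric dtypes.
    if not np.issubdtype(arr.dtype, np.number):
        return [f"X has non-numeric dtype {arr.dtype}"]

    problems = []
    if arr.ndim != 2:
        problems.append(
            f"X should be 2-D (n_samples, n_features), got {arr.ndim}-D"
        )
    if arr.size == 0:
        problems.append("X is empty")
    elif not np.isfinite(arr).all():
        problems.append("X contains NaN or infinite values")
    return problems


# Example: a clean 2-D array reports no problems, None reports one.
print(find_input_problems([[1.0, 2.0], [3.0, 4.0]]))
print(find_input_problems(None))
```

Calling `find_input_problems(X)` right before `kmeans.fit(X)` gives a quick indication of whether the data is a plausible cause.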

1 Answer


Your script works, but I assume that your X is not valid; see my comment under your post.

You can try the following, based on "Understanding Distortion and Inertia in K-Means Clustering".

So let's quickly generate synthetic data with valid X values:

import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

# Generate synthetic dataset
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=1.0, random_state=42)

# Initialize variables for the elbow method
K = range(1, 11)
distortions = []
inertias = []

# Calculate distortions and inertias for different numbers of clusters
# Note: both lists store kmeans.inertia_ (WCSS), so the two curves plotted later coincide
for k in K:
    kmeans = KMeans(n_clusters=k, init='k-means++', max_iter=300, n_init=10, random_state=0)
    kmeans.fit(X)
    distortions.append(kmeans.inertia_)
    inertias.append(kmeans.inertia_)

# Cluster the data for visualization
optimal_k = 4  # Assume 4 clusters for visualization
kmeans_optimal = KMeans(n_clusters=optimal_k, init='k-means++', max_iter=300, n_init=10, random_state=0)
y_kmeans = kmeans_optimal.fit_predict(X)

# Create a 2x2 layout with bottom two subplots merged for better visibility
fig = plt.figure(figsize=(12, 10))

# Plot 1: Original Data
ax1 = fig.add_subplot(2, 2, 1)
ax1.scatter(X[:, 0], X[:, 1], s=50)
ax1.set_title("Original Data")
ax1.grid(True)

# Plot 2: Clustered Data
ax2 = fig.add_subplot(2, 2, 2)
ax2.scatter(X[:, 0], X[:, 1], c=y_kmeans, s=50, cmap='viridis')
ax2.scatter(kmeans_optimal.cluster_centers_[:, 0], kmeans_optimal.cluster_centers_[:, 1], 
            s=200, c='red', marker='X', label='Centroids')
ax2.legend()
ax2.set_title(f"Clustered Data with {optimal_k} Clusters")
ax2.grid(True)

# Plot 3: Combined Elbow Method (Distortion and Inertia)
ax3 = fig.add_subplot(2, 1, 2)
ax3.plot(K, distortions, 'bx-', label='Distortion (Inertia)')
ax3.plot(K, inertias, 'go-', label='Inertia', markerfacecolor='none')
ax3.axvline(x=optimal_k, color='red', linestyle='--', label=f'Optimal k={optimal_k}')
ax3.set_xlabel('Number of Clusters (k)')
ax3.set_ylabel('Value\n  Within-Cluster Sum of Squares (WCSS)')
ax3.set_title('The Elbow Method (Distortion and Inertia)')
ax3.legend()
ax3.grid(True)

# Adjust layout and show the plots
plt.tight_layout()
plt.show()
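The script stores `kmeans.inertia_` under both names. As a rough illustration of how the two quantities can differ, the sketch below assumes the common definition of distortion as the mean squared distance from each point to its nearest centroid, while inertia is the sum of those squared distances (which is what scikit-learn documents for `inertia_`). The tiny dataset and the fixed centroids are made up for the example.

```python
import numpy as np

# Four hypothetical points and two hypothetical, fixed centroids.
X = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
centers = np.array([[0.0, 1.0], [10.0, 1.0]])

# Squared Euclidean distance from every point to every centroid: shape (4, 2).
squared_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)

# Keep only the distance to the nearest centroid for each point.
nearest_squared = squared_dist.min(axis=1)

inertia = nearest_squared.sum()      # within-cluster sum of squares (WCSS)
distortion = nearest_squared.mean()  # assumed definition: inertia / n_samples

print(f"inertia = {inertia}, distortion = {distortion}")
```

Under this definition the two curves have the same shape and differ only by the constant factor `n_samples`, so the elbow appears at the same k either way.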


2 Comments

What NumPy version did you use?
numpy = '1.26.4'
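Since the comments compare library versions, a short sketch for reporting what is installed in the current environment may help when sharing details. The package list is an assumption chosen for illustration (scikit-learn plus libraries it depends on); the helper name `installed_versions` is hypothetical.

```python
from importlib import metadata


def installed_versions(packages=("numpy", "scipy", "scikit-learn", "threadpoolctl")):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


versions = installed_versions()
for name, version in versions.items():
    print(f"{name}: {version or 'not installed'}")
```

Pasting this output next to the full traceback makes it easier for others to reproduce the environment.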
