
I have this code in Python:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap

def plot_decision_regions(X, y, classifier, resolution=0.02):
    # set up markers and a colormap with one color per class
    markers = ('s', 'x', 'o', '^', 'v')
    colors = ('red', 'blue', 'lightgreen', 'gray', 'cyan')
    cmap = ListedColormap(colors[:len(np.unique(y))])

    # build a grid covering the feature space, padded by 1 on each side
    x1_min, x1_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    x2_min, x2_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx1, xx2 = np.meshgrid(np.arange(x1_min, x1_max, resolution),
                           np.arange(x2_min, x2_max, resolution))

    # predict a label for every grid point and shade the regions
    Z = classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)
    Z = Z.reshape(xx1.shape)
    plt.contourf(xx1, xx2, Z, alpha=0.3, cmap=cmap)
    plt.xlim(xx1.min(), xx1.max())
    plt.ylim(xx2.min(), xx2.max())

    # plot the samples of each class with its own color and marker
    for idx, cl in enumerate(np.unique(y)):
        plt.scatter(x=X[y == cl, 0], y=X[y == cl, 1], alpha=0.8,
                    c=colors[idx], marker=markers[idx], label=cl,
                    edgecolor='black')

Here X is a 100x2 array of real-valued data (sepal and petal length for two kinds of flowers), y is a 100x1 vector containing only -1 and 1 values (the class-label vector), and the classifier is a Perceptron. I don't understand why I need to take the transpose in

Z = classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)

What do

classifier.predict 

and

x=X[y == cl, 0], y=X[y == cl, 1]

in

plt.scatter(x=X[y == cl, 0], y=X[y == cl, 1], alpha=0.8, c=colors[idx], marker=markers[idx], label=cl, edgecolor='black')

do?

I previously load a DataFrame, define my predict method, and define X and y:

def predict(self, X):
    '''Return class label after unit step'''
    return np.where(self.net_input(X) >= 0.0, 1, -1)

And my Perceptron class contains the weights w_ that are adjusted while iterating. Sorry if my English is not perfect.

y = df.iloc[0:100, 4].values
y = np.where(y == 'Iris-setosa', -1, 1)

X = df.iloc[0:100, [0, 2]].values
  • Can you give us some more detail about the context? I.e.: what are you trying to classify? What are the parameters X and y in the function call? What do they contain (i.e.: X contains data and y the class labels)?
  • I put more information now. Tell me if it's not enough.
  • Can you post enough code to reproduce your results? This should include a random X and y array, colors, and markers.
  • I put more information. For colors and markers, they were defined in the code.

1 Answer


Let's break this down. First:

np.array([xx1.ravel(), xx2.ravel()])

.ravel() flattens the xx1 and xx2 arrays. xx1 and xx2 are just coordinates (for feature 1 and feature 2 respectively) arranged in a grid pattern. The idea is that xx1 and xx2 hold a coordinate at every resolution interval across the range of the feature set. With enough of these coordinates, you can effectively see which regions your classifier assigns to which label.
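A tiny grid makes this concrete (the numbers are just illustrative, not from the Iris data):

import numpy as np

xx1, xx2 = np.meshgrid(np.arange(0, 3), np.arange(0, 2))
# xx1 = [[0 1 2]     xx2 = [[0 0 0]
#        [0 1 2]]           [1 1 1]]
print(xx1.ravel())  # [0 1 2 0 1 2]
print(xx2.ravel())  # [0 0 0 1 1 1]
# Reading the two flattened arrays in parallel gives every grid
# coordinate: (0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1)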

np.array([xx1.ravel(), xx2.ravel()]).T

The reason you need the transpose is that the .predict() method expects an input array of shape [n_samples, n_features]. Stacking the two ravelled arrays gives shape [n_features, n_samples], which is why we need to transpose.
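Continuing the tiny grid from above, you can check the shapes directly:

import numpy as np

xx1, xx2 = np.meshgrid(np.arange(0, 3), np.arange(0, 2))
stacked = np.array([xx1.ravel(), xx2.ravel()])
print(stacked.shape)    # (2, 6) -> [n_features, n_samples]
print(stacked.T.shape)  # (6, 2) -> [n_samples, n_features]
# Each row of stacked.T is one grid point: [feature1, feature2]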

classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)

This makes a prediction for each of the meshgrid points (the predictions are then used as a mask over the plot to show which regions the classifier assigns to which label).
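As a self-contained sketch of that call (using scikit-learn's Perceptron on made-up, linearly separable points rather than the asker's own class):

import numpy as np
from sklearn.linear_model import Perceptron

# Made-up, linearly separable training data
X = np.array([[0.0, 0.0], [0.2, 0.1], [2.0, 2.0], [2.1, 1.9]])
y = np.array([-1, -1, 1, 1])
classifier = Perceptron().fit(X, y)

xx1, xx2 = np.meshgrid(np.arange(0, 3), np.arange(0, 2))
grid_points = np.array([xx1.ravel(), xx2.ravel()]).T  # shape (6, 2)
Z = classifier.predict(grid_points)
print(Z.shape)  # (6,) -- one label (-1 or 1) per grid point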

plt.scatter(x=X[y == cl, 0], y=X[y == cl, 1], alpha=0.8, c=colors[idx], marker=markers[idx], label=cl, edgecolor='black')

Here, we plot our samples. We want to plot each class of samples separately (so that each class gets its own color), so x=X[y == cl, 0] and y=X[y == cl, 1] say: only plot the points whose label equals the class we are currently inspecting (i.e. cl). cl just iterates over all the unique labels.
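That boolean-mask indexing looks like this on made-up values:

import numpy as np

X = np.array([[5.1, 1.4], [4.9, 1.5], [7.0, 4.7], [6.4, 4.5]])
y = np.array([-1, -1, 1, 1])

cl = -1
print(y == cl)        # [ True  True False False]
print(X[y == cl, 0])  # feature-1 values of class -1: [5.1 4.9]
print(X[y == cl, 1])  # feature-2 values of class -1: [1.4 1.5]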

It's easier to understand once you see what the result looks like (here's an example using a make_blobs dataset and an MLPClassifier):

import numpy as np
import matplotlib.pyplot as plt

from matplotlib.colors import ListedColormap
from sklearn.datasets import make_blobs
from sklearn.neural_network import MLPClassifier

def plot_decision_regions(X, y, classifier, resolution=0.02):
    markers = ('s', 'x', 'o', '^', 'v')
    colors = ('red', 'blue', 'lightgreen', 'gray', 'cyan')
    cmap = ListedColormap(colors[:len(np.unique(y))])

    x1_min, x1_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    x2_min, x2_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx1, xx2 = np.meshgrid(np.arange(x1_min, x1_max, resolution),
                           np.arange(x2_min, x2_max, resolution))
    Z = classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)
    Z = Z.reshape(xx1.shape)
    plt.contourf(xx1, xx2, Z, alpha=0.3, cmap=cmap)
    plt.xlim(xx1.min(), xx1.max())
    plt.ylim(xx2.min(), xx2.max())

colors = ['red', 'blue', 'green']
X, y = make_blobs(n_features=2, centers=3)

# plot each class of samples in its own color
for idx, cl in enumerate(np.unique(y)):
    plt.scatter(x=X[y == cl, 0], y=X[y == cl, 1], alpha=0.8,
                c=colors[idx], label=cl, edgecolor='black')

classifier = MLPClassifier()
classifier.fit(X, y)

plot_decision_regions(X, y, classifier, resolution=0.02)
plt.show()

You get: [plot of the three make_blobs classes, with the classifier's decision regions shaded behind the points]


2 Comments

Thank you, you explain very well. One last question: why must I do Z = Z.reshape(xx1.shape)? I think it is to recover the shape the vector had before it was transposed.
It's to bring the predictions back to the shape of the meshgrid, since that's what plt.contourf accepts. Before the reshape, Z is a one-dimensional vector.
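Continuing the tiny 2x3 grid from earlier (the prediction values are made up):

import numpy as np

xx1, xx2 = np.meshgrid(np.arange(0, 3), np.arange(0, 2))
Z = np.array([-1, -1, 1, -1, 1, 1])  # one made-up prediction per grid point
print(Z.shape)            # (6,)
Z = Z.reshape(xx1.shape)
print(Z.shape)            # (2, 3) -- same layout as xx1 and xx2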
