5 votes
7 answers
169 views

I have a dataframe whose columns contain categorical responses. I'd like to label-encode the observations in all the columns in one go. Gender <- c("Male", "...
andrew's user avatar
  • 2,129
0 votes
0 answers
20 views

I am working in a Jupyter notebook with Python. I have a dataset whose output is a categorical variable with 4 different labels. I encoded this output data with Label ...
MZNG's user avatar
  • 11
0 votes
3 answers
74 views

I have a LabelEncoder with 500 classes. To store and load it, I used pickle: with open('../data/label_encoder_v500.pkl', 'rb') as file: label_encoder = pickle.load(file) I want to add 24 new ...
TkrA's user avatar
  • 716
3 votes
3 answers
1k views

I'm trying to use scikit-learn's LabelEncoder with a Polars DataFrame to encode a categorical column. I am using the following code. import polars as pl from sklearn.preprocessing import LabelEncoder ...
Simon's user avatar
  • 1,209
0 votes
1 answer
281 views

I was trying to convert string data into numerical data in a CSV excel sheet. It kept giving me an error about previously unseen labels, so I searched it up and found that we can use Label Encoder to ...
Kevin Phillips's user avatar
0 votes
0 answers
190 views

I need to predict different types of exploitation using a RandomForestClassifier. My dataset contains several categorical variables such as gender, citizenship, and CountryOfExploitation. These ...
tswift1998's user avatar
1 vote
0 answers
66 views

I'm attempting to incorporate various transformations into a scikit-learn pipeline along with a LightGBM model. This model aims to predict the prices of second-hand vehicles. Once trained, I plan to ...
alexquilis1's user avatar
1 vote
0 answers
69 views

Problem Description: Label Encoding Issue: Upon rerunning the label encoding code, the labels change, causing inconsistency. Dynamic Data from a Server: Incoming data might introduce new values, ...
Zeeshan Khalid's user avatar
0 votes
2 answers
95 views

I need to encode a column having 4 classes i.e., Education with classes Bachelor's, Master's, PhD, and High School. When I fit the label encoder to the training set (tr, here) and transform the test ...
Aditya Shandilya's user avatar
1 vote
1 answer
165 views

# Import necessary libraries import pandas as pd import numpy as np from sklearn.model_selection import train_test_split from xgboost import XGBRegressor from sklearn.metrics import mean_squared_error,...
RockMachine98's user avatar
0 votes
0 answers
18 views

I worked on a classification problem for the Kaggle Competition, and I found there are 3 categorical columns, but I noticed that the service column in the train data has 66 categories, and in the test ...
user22689820's user avatar
0 votes
1 answer
41 views

I am building a model where LabelEncoding of 2 categorical columns is a better approach. So I had implemented the same on the train_df and finalized the model. And for predicting the test_df, I used ...
SM079's user avatar
  • 797
0 votes
0 answers
69 views

I'm trying to run a neural network, and I have some questions related to this error, because I'm trying to convert categorical inputs into numerical ordinary input. So the data equals: data base so I ...
Leonor Magalhães's user avatar
4 votes
0 answers
15k views

I am trying to build a K-Nearest Neighbor system that will help me classify distances. The original dataframe has the columns totalDistance and Label. To use KNN I have to encode the ...
Jesper Ezra's user avatar
-1 votes
1 answer
68 views

I am working on an image segmentation problem but I faced this issue. I am trying to encode labels in a multidimensional array, but it needs to be flattened, encoded, and reshaped -------------------...
Bet's user avatar
  • 19
-1 votes
1 answer
78 views

uv = np.unique(X[:, 2]) uv2 = np.unique(X_test[:, 2]) print(uv) #['Female' 'Male'] print(uv2) #['Female' 'Male'] # Encoding categorical columns in the train dataset from sklearn.preprocessing ...
Mohamed Aissaoui's user avatar
0 votes
1 answer
395 views

As you can see, I have a preprocessing function here that does some conversion operations. I have some categorical variables, which I defined as categorical_cols, and I am using LabelEncoder for them. My ...
Zyxnon's user avatar
  • 15
0 votes
2 answers
77 views

How do I know when to apply LabelEncoder() or OneHotEncoder()? I have used LabelEncoder to encode categorical variables for a RandomForestRegressor model and it gives an extremely high mean squared error. ...
Kugan's user avatar
  • 1
0 votes
1 answer
903 views

I have built a machine learning model using 34 features. Now I want to check how well the model predicts the new data value. However, initially there were 26 features but one-hot and label encoding ...
Siddharth Nigade's user avatar
0 votes
1 answer
87 views

I'm working on a dataset and label-encoded the categorical features. When I tried to do the inverse, an error appeared like this: ----> 1 original_labels = labelencoder.inverse_transform(df['model'...
kim seol's user avatar
0 votes
2 answers
49 views

Here is a snippet of my df: 0 1 2 3 4 5 ... 11 12 13 14 15 16 0 BSO PRV BSI TUR WSP ACP ... HLR HEX HEX None None None 1 BSO PRV BSI ...
Tony Sirico's user avatar
0 votes
1 answer
337 views

When I convert data from a pandas dataframe to sklearn so I can make predictions, string data becomes problematic. So I used LabelEncoder, but it seems to limit me to using the encoded data instead of ...
M.Namjoo's user avatar
0 votes
1 answer
73 views

I have a dataframe: data = [['p1', 't1'], ['p4', 't2'], ['p2', 't1'],['p4', 't3'], ['p4', 't3'], ['p3', 't1'],] sdf = spark.createDataFrame(data, schema = ['id', 'text']) sdf.show() +---+--...
Rory's user avatar
  • 383
0 votes
1 answer
42 views

I have a X_train dataframe. One of the columns locale has the unique values: ['Regional', 'Local', 'National']. I am trying to make this column into an Ordered Categorical variable, with the correct ...
Katsu's user avatar
  • 9,075
1 vote
1 answer
142 views

I am working on a machine learning project and have imported all the libraries. I took one column of data (this column is an array of bool) and I want to apply LabelEncoder to it. Here is my whole code. data = pd.read_csv('...
metkopetru's user avatar
1 vote
2 answers
66 views

encoder=LabelEncoder() categorical_features=df.columns.tolist() for col in categorical_features: df[col]=encoder.fit_transform(df[col]) df.head(20) **I want categorical_features to take columns ...
aarthi sharma's user avatar
1 vote
1 answer
46 views

This is my label value: df['Label'].value_counts() ------------------------------------ Benign 4401366 DDoS attacks-LOIC-HTTP 576191 FTP-BruteForce 193360 SSH-...
Dead's user avatar
  • 11
0 votes
1 answer
482 views

I want to label-encode a column called article_id which has unique identifiers for an article. Integer values kind of implicitly have an order to them, because 3 > 2 > 1. I wonder what is the ...
christallclear's user avatar
0 votes
2 answers
479 views

I'm working on a ML webapp and am training data from a CSV file. When converting the data array to float the ValueError appears CODE X[:, 0] = le_country.transform(X[:,0]) X[:, 1] = le_education....
birbhambra's user avatar
2 votes
2 answers
275 views

I want to label encode subgroups in a pandas dataframe. Something like this: | Category | | Name | | ---------- | | --------- | | FRUITS | | Apple | | FRUITS | | Orange | | ...
rohit deraj's user avatar
0 votes
1 answer
1k views

I've created a subclass of the keras.models.Sequential class in order to override the fit() and predict() functions. My goal is to 'hide' a sklearn LabelEncoder. This way I can directly call fit() ...
fortune_pickle's user avatar
0 votes
1 answer
604 views

I have created an ML model with random forest. It has 6000+ data points with 27 features, of which about 22 were categorical, and I used a label encoder on them. Now when I have to predict the result is ...
Muhammad Minhas's user avatar
0 votes
1 answer
638 views

I have a total of around 80 columns, of which some 20 are categorical and need to be label encoded. I checked the solution provided here and the solution stated to work with Feature ...
Sunag's user avatar
  • 75
0 votes
1 answer
160 views

I have a categorical dataset with 50 columns. Among them only 5 columns are numerical. I would like to apply a label encoder to turn the categorical columns into numerical columns. Categorical columns are ...
Encipher's user avatar
  • 3,488
2 votes
0 answers
141 views

I'm trying to create an automated data pre-processing library and I want to transform the string data into numerical data so it can be run through ML algorithms. But I can't seem to reverse it back to its ...
brockwill1's user avatar
0 votes
1 answer
247 views

I am trying to use dask_cudf to preprocess a very large dataset (150,000,000+ records) for multi-class xgboost training and am having trouble encoding the class column (dtype is string). I tried using ...
Tejas Sriram's user avatar
0 votes
1 answer
2k views

Whenever I try to execute the following code, it shows ValueError: y contains previously unseen labels: 'some_label' X_test['Gender'] = le.transform(X_test['Gender']) X_test['Age'] = le....
Nil's user avatar
  • 1
0 votes
2 answers
175 views

I have a dataset of which I have attached an image. The set of unique values in Origin and Dest is the same. Upon label encoding those columns, I thought that the value ATL would get the same encoding in ...
Utkarsh A's user avatar
0 votes
0 answers
529 views

For context, I am taking Ad listing data for Machines and using it to predict the type of Machine. I have used the RandomForestClassifier for class prediction. In the model I have used LabelEncoder to ...
jackyg's user avatar
  • 11
0 votes
1 answer
603 views

I am looking to run classification on a column that has a few possible values, but I want to consolidate them into fewer labels. For example, a job may have multiple end states: success, fail, error, ...
Ehud Kaldor's user avatar
0 votes
1 answer
444 views

I have a dataset with 39 categorical and 27 numerical features. I am trying to encode the categorical data and need to be able to inverse transform and call transform for each column again. Is there a ...
Tom_Scott's user avatar
  • 115
2 votes
1 answer
1k views

I would like to LabelEncode a column in pandas where each row contains a list of strings. Since a similar string/text carries the same meaning across rows, encoding should respect that, and ideally ...
TwinPenguins's user avatar
0 votes
1 answer
78 views

I keep getting AttributeError: 'DataFrame' object has no attribute 'column' when I run the function on a column in a dataframe def reform (column, dataframe): if dataframe.column.nunique() > 2 ...
Tolulope Beckley's user avatar
0 votes
0 answers
39 views

I am trying to do label encoding using scikit-learn's built-in function, but why does my result print as a row instead of an additional column? from sklearn.preprocessing import LabelEncoder # creating ...
odebear's user avatar
  • 53
0 votes
1 answer
2k views

I have a dataframe with categorical values, such as city names. For the ML algorithm, I then need to encode the data into numerical values. I do it like this: df[cat_columns] = df[cat_columns].apply(...
pacdev's user avatar
  • 591
1 vote
1 answer
1k views

I'm new to Python ML using scikit-learn. I was working on a solution to create a model with three columns: Pets, Owner and location. import pandas import joblib from sklearn.tree import ...
ItsMeGokul's user avatar
1 vote
1 answer
745 views

I am developing a classification base model. I have used the concept of ColumnTransformer and Pipeline for feature engineering and selection, model selection, and for everything. I wanted to encode my ...
Shreejan Shrestha's user avatar
0 votes
3 answers
2k views

I have the below code snippet: df = pd.read_csv("data.csv") X = df.drop(['label'], axis=1) Y= df['label'] le = LabelEncoder() Y = le.fit_transform(Y) mapping = dict(zip(le.classes_, range(...
chas's user avatar
  • 1,655
1 vote
1 answer
510 views

I am trying to label-encode my cities. However, I want the labels assigned according to how often each city appears. Let's say Oslo has 500 rows, Berlin has 400 rows, Napoli has 300 rows in the dataset ...
efc07's user avatar
  • 33
0 votes
1 answer
935 views

I'm trying to encode data in a CSV file. The TA in class recommended LabelEncoder in sklearn. There's one column named education_level, and I need to encode it in "High, Medium, Low" order. But the ...
ExcitedMail's user avatar
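
Several of the questions listed above ask for an encoding that follows a chosen order rather than the alphabetical order LabelEncoder assigns. A minimal sketch of that idea in plain Python follows. The list `EDUCATION_ORDER` and the helpers `encode_ordered` and `decode_ordered` are hypothetical names introduced only for illustration, and the Low < Medium < High ranking is an assumption based on the wording of the last excerpt. In scikit-learn, the `categories` parameter of `OrdinalEncoder` is probably the closer fit for feature columns.

```python
# Assumed ranking for illustration only: Low < Medium < High.
EDUCATION_ORDER = ["Low", "Medium", "High"]


def encode_ordered(values, order):
    """Map each value to its position in `order`; fail loudly on unknown values."""
    codes = {label: index for index, label in enumerate(order)}
    unknown = sorted(set(values) - set(codes))
    if unknown:
        raise ValueError(f"values not in the declared order: {unknown}")
    return [codes[value] for value in values]


def decode_ordered(encoded, order):
    """Inverse of encode_ordered: turn integer codes back into labels."""
    return [order[index] for index in encoded]


sample = ["High", "Low", "Medium", "Low"]
encoded = encode_ordered(sample, EDUCATION_ORDER)
print(encoded)  # [2, 0, 1, 0]
print(decode_ordered(encoded, EDUCATION_ORDER))
```

Keeping the order in one explicit list means the same mapping can be reused on training and test data, and a value that was never declared raises an error instead of silently receiving a new code.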