I am trying to accomplish something I thought would be easy: Take three columns from my dataframe, use a label encoder to encode them, and simply replace the current values with the new values.

I have a dataframe that looks like this:

|  Order_Num  |  Part_Num  | Site | BUILD_ID |
| MO100161015 | PPT-100K39 | BALT |   A001   |
| MO100203496 | MDF-925R36 | BALT |   A001   |
| MO100203498 | PPT-825R34 | BALT |   A001   |
| MO100244071 | MDF-323DCN | BALT |   A001   |
| MO100244071 | MDF-888888 | BALT |   A005   |

I am essentially trying to use sklearn's LabelEncoder() to switch my String variables to numeric. Currently, I have a function str_to_num where I feed it a column and it returns me an array (column) of the converted data. It works great.

However, I am struggling to remove the old data from my dataframe and put the new data in its place. My script is below:

from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn import preprocessing
import pandas as pd
import numpy as np

# Convert the passed in column
def str_to_num(arr):
    le = preprocessing.LabelEncoder()
    array_of_parts = []
    for x in arr:
        array_of_parts.append(x)

    new_arr = le.fit_transform(array_of_parts)
    return new_arr

# read in data from csv
data = pd.read_csv('test.csv')
print(data)
# Create the new data
converted_column = str_to_num(data['Order_Num'])
print(converted_column)

# How can I replace data['Order_Num'] with the values in converted_column?

# Drop the old data
dropped = data.drop('Order_Num', axis=1)
# Add the new_data column to the place where the old data was?

Given my current script, how can I replace the values in the 'Order_Num' column with those in converted_column? I have tried pandas.DataFrame.replace, but that replaces specific values, and I don't know how to map that to the returned data.

I would hope my expected data to be:

| Order_Num |  Part_Num  | Site | BUILD_ID |
|     0     | PPT-100K39 | BALT |   A001   |
|     1     | MDF-925R36 | BALT |   A001   |
|     2     | PPT-825R34 | BALT |   A001   |
|     3     | MDF-323DCN | BALT |   A001   |
|     3     | MDF-888888 | BALT |   A005   |

My python --version returns

3.6.7

  • Please read minimal reproducible example and edit your question accordingly. Specifically, this is not a minimal example and I don't see your expected results. Commented Jul 24, 2019 at 17:05
  • Curious how this isn't minimal. Can you provide further feedback as opposed to linking me to a post I have read many times? If I don't post the remainder of my script, I get comments my post is not "reproducible". If I post the whole script, I get comments it is not reproducible. I am looking for feedback, please. I can post the expected results, but it should be straightforward given the code. Commented Jul 24, 2019 at 17:08
  • I can only speak for myself. But as I read your question, my eyes start to glaze over as I see all the code I have to parse. I don't want to spend my precious time sifting through that much code (it's not a tremendous amount, but it's more than I want to do for free). Then I think to myself, "Was this the minimum amount of code needed to get OP's point across? Certainly they must have been able to use less code to ask the same question, right? Only way to really tell is to sift through that code... and I don't want to do that." Commented Jul 24, 2019 at 17:11
  • have you tried data['Order_Num'] = str_to_num(data['Order_Num']) Commented Jul 24, 2019 at 17:15
  • Understood. I tried to edit it to be more concise while still getting the point across. Sorry for wasting your time, I am trying to learn as I go (both in life and on SO =)) @piRSquared Commented Jul 24, 2019 at 17:18

1 Answer

The beauty of pandas is sometimes understated - often you only need to do something like this:

data['Order_Num'] = str_to_num(data['Order_Num'])

There's also the option of df.apply(), which runs a function over each column and so can encode several columns in one call.
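Since the question mentions encoding three columns, here is a minimal sketch of the df.apply() route. It rebuilds the sample rows from the question inline instead of reading test.csv, and the choice of columns in cols_to_encode is an assumption for illustration (the question only names Order_Num). Each column gets its own freshly fitted LabelEncoder, so the integer codes are independent per column.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Sample rows copied from the question's table (a stand-in for test.csv).
data = pd.DataFrame({
    "Order_Num": ["MO100161015", "MO100203496", "MO100203498",
                  "MO100244071", "MO100244071"],
    "Part_Num": ["PPT-100K39", "MDF-925R36", "PPT-825R34",
                 "MDF-323DCN", "MDF-888888"],
    "Site": ["BALT"] * 5,
    "BUILD_ID": ["A001", "A001", "A001", "A001", "A005"],
})

# Assumption: these are the three columns to encode.
cols_to_encode = ["Order_Num", "Part_Num", "BUILD_ID"]


def encode_column(col):
    # Fit a fresh encoder on this column only and keep the original index.
    codes = LabelEncoder().fit_transform(col)
    return pd.Series(codes, index=col.index)


# apply() calls encode_column once per selected column; assigning the
# result back overwrites the original string values in place.
data[cols_to_encode] = data[cols_to_encode].apply(encode_column)

print(data)
```

Discarding the encoders like this is fine for a one-off conversion. If you later need to map the integers back to the original strings, keep each fitted LabelEncoder around (for example in a dict keyed by column name) and call its inverse_transform().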

1 Comment

Glad you saw what was needed. PlusOne.
