My current code (below) creates a new column from the values of an existing column. Several different values represent similar things, e.g. Car and Van, or Ship, Boat and Submarine, and I want each group classified under a single value in the new column, such as Vehicle or Boat.

Code with a simplified dataset example:

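def f(row):
    # Map each raw value in the 'Type' column to its bucket
    if row['Type'] == 'Car':
        val = 'Vehicle'
    elif row['Type'] == 'Van':
        val = 'Vehicle'
    elif row['Type'] == 'Ship':
        val = 'Boat'
    elif row['Type'] == 'Scooter':
        val = 'Bike'
    elif row['Type'] == 'Segway':
        val = 'Bike'
    else:
        val = None  # no matching bucket; avoids returning an unassigned name
    return val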

What is the best method, similar to using wildcards, to avoid typing out each value when there are many values (30 or more) that I want to bucket into the same new value in the new column?

Thanks

1 Answer

One way is to use np.select with isin. Note that any value matching neither list falls through to the default, "Boat":

import numpy as np
import pandas as pd

df = pd.DataFrame({"Type":["Car","Van","Ship","Scooter","Segway"]})

df["new"] = np.select([df["Type"].isin(["Car","Van"]),
                       df["Type"].isin(["Scooter","Segway"])],
                      ["Vehicle","Bike"],"Boat")

print(df)

      Type      new
0      Car  Vehicle
1      Van  Vehicle
2     Ship     Boat
3  Scooter     Bike
4   Segway     Bike
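
With 30 or more values, it may be easier to keep the buckets in a dictionary and look each value up with Series.map, rather than adding one isin condition per bucket. The following is only a sketch of that alternative: the bucket contents, the extra "Tram" row, and the "Other" fallback label are assumptions made up for illustration and are not part of the question's data.

```python
import pandas as pd

# Hypothetical bucket definitions: each new label lists the raw values it absorbs.
buckets = {
    "Vehicle": ["Car", "Van"],
    "Boat": ["Ship", "Submarine"],
    "Bike": ["Scooter", "Segway"],
}

# Invert to a flat lookup: raw value -> bucket label.
lookup = {raw: label for label, raws in buckets.items() for raw in raws}

# "Tram" is an assumed extra value that belongs to no bucket.
df = pd.DataFrame({"Type": ["Car", "Van", "Ship", "Scooter", "Segway", "Tram"]})

# map() yields NaN for values missing from the lookup;
# fillna() then labels those rows explicitly instead of guessing a bucket.
df["new"] = df["Type"].map(lookup).fillna("Other")

print(df)
```

Unlike the np.select default, unmatched values here get their own label, so a raw value that was forgotten in the dictionary is easy to spot.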