Creating new column based on multiple different values

Question

I have the code below, which creates a new column from the values of an existing column. Several values represent similar things, such as Car and Van, or Ship, Boat and Submarine, and I want each group classified under a single value in the new column, such as Vehicle or Boat.

Code with a simplified dataset example:

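def f(row):
    # Map each raw value in the 'Type' column to its bucket.
    if row['Type'] == 'Car':
        val = 'Vehicle'
    elif row['Type'] == 'Van':
        val = 'Vehicle'
    elif row['Type'] == 'Ship':
        val = 'Boat'
    elif row['Type'] == 'Scooter':
        val = 'Bike'
    elif row['Type'] == 'Segway':
        val = 'Bike'
    else:
        # Without a fallback, an unmatched value leaves val undefined.
        val = None
    return val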

What is the best method, similar to using wildcards, to avoid typing out each value when there are many values (30 or more) that I want to bucket into the same value in the new column?

Thanks

Henry Yik · Accepted Answer · 2020-01-13 04:20:45Z


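One way is to use np.select with isin:

import numpy as np
import pandas as pd

df = pd.DataFrame({"Type": ["Car", "Van", "Ship", "Scooter", "Segway"]})

# Each condition pairs with the label at the same position;
# rows matching no condition get the default, "Boat".
df["new"] = np.select([df["Type"].isin(["Car", "Van"]),
                       df["Type"].isin(["Scooter", "Segway"])],
                      ["Vehicle", "Bike"], default="Boat")

print(df)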

      Type      new
0      Car  Vehicle
1      Van  Vehicle
2     Ship     Boat
3  Scooter     Bike
4   Segway     Bike
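
With 30 or more values, a dictionary lookup combined with Series.map may be easier to maintain than a long list of isin conditions. This is not part of the answer above, only a minimal sketch of an alternative. The `buckets` dictionary, the `lookup` name and the "Other" fallback label are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Type": ["Car", "Van", "Ship", "Scooter", "Segway"]})

# Hypothetical bucket definitions: new label -> raw values it covers.
buckets = {
    "Vehicle": ["Car", "Van"],
    "Boat": ["Ship", "Boat", "Submarine"],
    "Bike": ["Scooter", "Segway"],
}

# Invert into a flat lookup: raw value -> new label.
lookup = {raw: label for label, raws in buckets.items() for raw in raws}

# map() yields NaN for values missing from the lookup; fillna labels them.
df["new"] = df["Type"].map(lookup).fillna("Other")

print(df)
```

Unlike the np.select default, a value missing from the lookup ends up as "Other" here rather than as "Boat", which makes unmapped values easier to spot.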

