Create a new column from two existing text columns in a DataFrame using pandas/python

Question

I have a DataFrame with two columns, "Start_location" and "End_location". I want to create a new column called "Location" from these two columns under the following conditions.

If "Start_location" == "End_location", then "Location" should take that shared value. Otherwise, if the two values differ, "Location" should be "Start_location"-"End_location", i.e. the two values joined with a hyphen.

Here is an example of my data.

+---+--------------------+-----------------------+
|   |  Start_location    |      End_location     |
+---+--------------------+-----------------------+
| 1 | Stratford          |      Stratford        |
| 2 | Bromley            |      Stratford        |
| 3 | Brighton           |      Manchester       |
| 4 | Delaware           |      Delaware         |
+---+--------------------+-----------------------+

The result I want is this.

+---+--------------------+-----------------------+--------------------+
|   |  Start_location    |      End_location     |   Location         |
+---+--------------------+-----------------------+--------------------+
| 1 | Stratford          |      Stratford        |   Stratford        |
| 2 | Bromley            |      Stratford        | Bromley-Stratford  |
| 3 | Brighton           |      Manchester       | Brighton-Manchester|
| 4 | Delaware           |      Delaware         |    Delaware        |
+---+--------------------+-----------------------+--------------------+

I would be happy if anyone can help.

P.S. Forgive me if this is a very basic question. I have gone through some similar questions on this topic but couldn't make any headway.

Chris Schmitz · Accepted Answer · 2020-07-22 13:10:13Z

2

You can write your own function for this and then apply it row by row with apply and a lambda:

def get_location(start, end):
    # Identical endpoints collapse to a single name
    if start == end:
        return start
    # Otherwise join the two names with a hyphen, as in the desired output
    return start + '-' + end

df['Location'] = df.apply(lambda x: get_location(x.Start_location, x.End_location), axis=1)
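
To try this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the same row-wise function. The `df` construction is my own assumption for illustration, based on the table in the question (plain strings, no missing values), and the separator is a plain hyphen to match the desired output.

```python
import pandas as pd

# Sample data copied from the question's table (assumed to be plain strings)
df = pd.DataFrame({
    "Start_location": ["Stratford", "Bromley", "Brighton", "Delaware"],
    "End_location": ["Stratford", "Stratford", "Manchester", "Delaware"],
})

def get_location(start, end):
    # Same place at both ends: keep a single name
    if start == end:
        return start
    # Different places: join them with a hyphen
    return start + "-" + end

# axis=1 passes each row to the lambda as a Series
df["Location"] = df.apply(
    lambda row: get_location(row["Start_location"], row["End_location"]),
    axis=1,
)

print(df)
```

Note that apply with axis=1 calls the Python function once per row. That is fine for a small frame like this one, but the vectorized approaches are usually faster on large data.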

Mini Fridge · Accepted Answer · 2020-07-22 13:10:16Z

1

df['Location'] = df[['Start_location', 'End_location']].apply(lambda x: x.iloc[0] if x.iloc[0] == x.iloc[1] else x.iloc[0] + '-' + x.iloc[1], axis=1)

Tarequzzaman Khan · Accepted Answer · 2020-07-22 13:29:38Z

1

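You can use NumPy to compare the two columns and build the result in one vectorized step:

import numpy as np

# Where the two columns match, keep the start value; otherwise join them with a hyphen
df["Location"] = np.where(
    df["Start_location"] == df["End_location"],
    df["Start_location"],
    df["Start_location"] + "-" + df["End_location"],
)

df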

Output:

  Start_location End_location             Location
0      Stratford    Stratford            Stratford
1        Bromley    Stratford    Bromley-Stratford
2       Brighton   Manchester  Brighton-Manchester
3       Delaware     Delaware             Delaware
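
For anyone who wants to run the np.where approach without an existing df, here is a self-contained sketch. The sample frame is rebuilt from the question's table, and the intermediate names `same` and `joined` are just illustrative; I'm assuming the values are plain strings with no missing entries.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question's table (assumed to be plain strings)
df = pd.DataFrame({
    "Start_location": ["Stratford", "Bromley", "Brighton", "Delaware"],
    "End_location": ["Stratford", "Stratford", "Manchester", "Delaware"],
})

# Boolean mask: True where start and end are the same place
same = df["Start_location"] == df["End_location"]

# Hyphen-joined names, computed for every row up front
joined = df["Start_location"] + "-" + df["End_location"]

# Pick the start value where the mask is True, the joined value elsewhere
df["Location"] = np.where(same, df["Start_location"], joined)

print(df)
```

np.where receives both candidate arrays in full, so the concatenation is computed for every row, including the matching ones. That is harmless here and still avoids a Python-level loop over rows.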

edited Jul 22, 2020 at 13:29

answered Jul 22, 2020 at 13:20

Tarequzzaman Khan · Accepted Answer · 2020-07-29 18:53:04Z

1

Use np.select(condition, choice). To join the start and end values, use the .str.cat() method.

import numpy as np

# Two mutually exclusive conditions, each paired with the value to use when it holds
condition = [df['Start_location'] == df['End_location'], df['Start_location'] != df['End_location']]
choice = [df['Start_location'], df['Start_location'].str.cat(df['End_location'], sep='-')]
df['Location'] = np.select(condition, choice)

df

  Start_location End_location             Location
1      Stratford    Stratford            Stratford
2        Bromley    Stratford    Bromley-Stratford
3       Brighton   Manchester  Brighton-Manchester
4       Delaware     Delaware             Delaware
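
Here is a runnable sketch of the np.select approach on the question's sample data. The frame construction is an assumption on my part (plain strings, default 0-based index rather than the 1-based index shown above), and the separator is a hyphen to match what the question asks for.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question's table (assumed to be plain strings)
df = pd.DataFrame({
    "Start_location": ["Stratford", "Bromley", "Brighton", "Delaware"],
    "End_location": ["Stratford", "Stratford", "Manchester", "Delaware"],
})

# Conditions are checked in order; the first True one decides the row
condition = [
    df["Start_location"] == df["End_location"],
    df["Start_location"] != df["End_location"],
]

# One choice per condition, in the same order
choice = [
    df["Start_location"],
    df["Start_location"].str.cat(df["End_location"], sep="-"),
]

df["Location"] = np.select(condition, choice)

print(df)
```

For rows where no condition is True, np.select falls back to its default argument (0 unless you pass one). The two conditions here are complementary, so every row is covered. np.select is mostly worth reaching for when there are more than two cases; with only two, np.where does the same job.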

edited Jul 29, 2020 at 18:53

Tarequzzaman Khan

answered Jul 22, 2020 at 13:15

wwnde
