
I have a DataFrame with two columns, Date and Location (object dtype). Below is a sample of the data:

 Date                                            Location
1     07/12/1912                            AtlantiCity, New Jersey   
2     08/06/1913                 Victoria, British Columbia, Canada   
3     09/09/1913                                 Over the North Sea   
4     10/17/1913                         Near Johannisthal, Germany   
5     03/05/1915                                    Tienen, Belgium   
6     09/03/1915                              Off Cuxhaven, Germany   
7     07/28/1916                              Near Jambol, Bulgeria   
8     09/24/1916                                Billericay, England   
9     10/01/1916                               Potters Bar, England   
10    11/21/1916                                     Mainz, Germany

My requirement is to split Location on the "," separator and keep only the last part (e.g. New Jersey, Canada, Germany, England) in the Location column. I also need to check whether a value consists of a single element (i.e. it contains no ",").

Is there a way to do this with a built-in method, without looping over every row?

Sorry if the question is below standard; I am new to Python and still learning.

2 Answers


A straightforward way is to apply the split method to each element of the column, pick the last piece, and strip the space left over after the comma:

df.Location.apply(lambda x: x.split(",")[-1].strip())

1             New Jersey
2                 Canada
3     Over the North Sea
4                Germany
5                Belgium
6                Germany
7               Bulgeria
8                England
9                England
10               Germany
Name: Location, dtype: object
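
Since the question asks for a built-in method, the same result can also be sketched with the vectorized .str accessor instead of apply. The small DataFrame below is rebuilt from a few rows of the question for illustration only, and last_part is a name introduced here:

```python
import pandas as pd

# A few rows copied from the question, for illustration only
df = pd.DataFrame({
    "Date": ["07/12/1912", "08/06/1913", "09/09/1913", "10/17/1913"],
    "Location": [
        "AtlantiCity, New Jersey",
        "Victoria, British Columbia, Canada",
        "Over the North Sea",
        "Near Johannisthal, Germany",
    ],
})

# Split on commas, take the last piece of each resulting list,
# then trim the space that followed the comma
last_part = df["Location"].str.split(",").str[-1].str.strip()

print(last_part.tolist())
# ['New Jersey', 'Canada', 'Over the North Sea', 'Germany']
```

Assigning last_part back to df["Location"] would overwrite the column in place, as the question requires.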

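To check whether each cell has more than one element, we can use the str.contains method on the column. True means the value contains a comma; False marks a value with only a single element: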

df.Location.str.contains(",")

1      True
2      True
3     False
4      True
5      True
6      True
7      True
8      True
9      True
10     True
Name: Location, dtype: bool
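
If the two cases need different handling, the boolean mask can be inverted and combined with Series.where. This is only a sketch under the assumption that the column has no missing values; locations, has_comma, single_element and cleaned are names made up for the example:

```python
import pandas as pd

# Three sample values taken from the question
locations = pd.Series(
    ["Tienen, Belgium", "Over the North Sea", "Mainz, Germany"],
    name="Location",
)

# regex=False treats "," as a literal character
has_comma = locations.str.contains(",", regex=False)
single_element = ~has_comma

# Keep single-element values unchanged; elsewhere use the last
# comma-separated piece with surrounding whitespace removed
cleaned = locations.where(
    single_element,
    locations.str.split(",").str[-1].str.strip(),
)

print(single_element.tolist())  # [False, True, False]
print(cleaned.tolist())         # ['Belgium', 'Over the North Sea', 'Germany']
```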

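Another option is str.extract with a regular expression that captures everything after the last comma (the match keeps the space that follows the comma, so chain .str.strip() if that matters):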

print(df['Location'].str.extract(r'([^,]+$)'))    
#0            New Jersey
#1                Canada
#2    Over the North Sea
#3               Germany
#4              Belgium 
#5               Germany
#6              Bulgeria
#7               England
#8               England
#9               Germany
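
To write the extracted value back into the column, one possible sketch is shown below. It assumes a single capture group and passes expand=False so that str.extract returns a Series rather than a one-column DataFrame; the sample rows are taken from the question:

```python
import pandas as pd

df = pd.DataFrame({
    "Location": [
        "Billericay, England",
        "Over the North Sea",
        "Victoria, British Columbia, Canada",
    ],
})

# Capture the run of non-comma characters at the end of each string,
# then strip the leading space left over from ", "
df["Location"] = (
    df["Location"]
    .str.extract(r"([^,]+)$", expand=False)
    .str.strip()
)

print(df["Location"].tolist())
# ['England', 'Over the North Sea', 'Canada']
```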
