I have a list like this:

x = ['Las Vegas', 'San Francisco', 'Dallas']

And a dataframe that looks a bit like this:

import pandas as pd
data = [['Las Vegas (Clark County)', 25], ['New York', 23],
        ['Dallas', 27]]
df = pd.DataFrame(data, columns=['City', 'Value'])

I want to replace city values in the DataFrame such as "Las Vegas (Clark County)" with "Las Vegas". My DataFrame contains multiple cities with varying names that need to be changed. I know I could use a regular expression to strip off the parenthesized part, but I was wondering whether there is a more clever, generic way.

1 Answer

Use Series.str.extract with the list values joined by | (regex OR), then replace the non-matched values with the originals using Series.fillna:

df['City'] = df['City'].str.extract(f'({"|".join(x)})', expand=False).fillna(df['City'])
print(df)
        City  Value
0  Las Vegas     25
1   New York     23
2     Dallas     27
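
One caveat with joining the raw list values: each name is treated as a regex, so a name containing characters such as "." or "(" would not be matched literally, and a shorter name listed first could match before a longer one that contains it. A minimal sketch of a more defensive variant, assuming the same x and df as in the question (the escaping with re.escape and the longest-first ordering are my additions, not part of the approach above):

```python
import re
import pandas as pd

x = ['Las Vegas', 'San Francisco', 'Dallas']
df = pd.DataFrame(
    [['Las Vegas (Clark County)', 25], ['New York', 23], ['Dallas', 27]],
    columns=['City', 'Value'],
)

# Escape each name so regex metacharacters are matched literally, and
# put longer names first so a shorter name cannot win over a longer one.
pattern = '|'.join(re.escape(name) for name in sorted(x, key=len, reverse=True))

# Rows that match get the short name; rows without a match keep the original.
df['City'] = df['City'].str.extract(f'({pattern})', expand=False).fillna(df['City'])
print(df)
```

For the sample data this should give the same three city names as the output above; the difference only shows up with names that contain special characters or overlap each other.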

Another idea is to use Series.str.contains in a loop, but this will likely be slow for a large DataFrame and many values in the list:

for val in x:
    df.loc[df['City'].str.contains(val), 'City'] = val
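
Note that str.contains interprets val as a regex by default. If the names should be treated as plain substrings, a sketch of the same loop with literal matching could look like this, assuming the x and df from the question (regex=False and na=False are my additions; na=False only matters if the City column can hold missing values):

```python
import pandas as pd

x = ['Las Vegas', 'San Francisco', 'Dallas']
df = pd.DataFrame(
    [['Las Vegas (Clark County)', 25], ['New York', 23], ['Dallas', 27]],
    columns=['City', 'Value'],
)

for val in x:
    # regex=False: match val as a literal substring.
    # na=False: treat missing city values as "no match" so the mask stays boolean.
    mask = df['City'].str.contains(val, regex=False, na=False)
    df.loc[mask, 'City'] = val

print(df)
```

This still makes one pass over the column per list entry, so the performance concern above applies unchanged.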