I have a DataFrame 'df':

        id value
0      ABC    hi
1      XYZ   hey

that I want to compare to a list of strings 'str_list':

str_list = ['abc_123', 'xyz_456']

to find partial matches. If an 'id' partially matches an entry in str_list, I want to replace the corresponding 'value', producing something like this:

        id     value
0      ABC   new_val
1      XYZ   new_val

As of now I have this code:

df.loc[df['id'].isin(str_list), 'value'] = 'new_val'

but that only works on complete matches (the df 'id' values would have to be abc_123 and xyz_456) in order for new_val to be applied.

How can I modify this to accept partial matches?

import pandas as pd
str_list = ['abc_123', 'xyz_456']
df = pd.DataFrame({'id':['ABC','XYZ'], 'value':['hi','hey']})
# this commented out df will trigger the matches correctly
#df = pd.DataFrame({'id':['abc_123','xyz_456'], 'value':['hi','hey']})
print(df)
df.loc[df['id'].isin(str_list), 'value'] = 'new_val'
print(df)
  • Please accept as solution by clicking the checkmark next to the best solution. Commented Oct 7, 2020 at 17:09

2 Answers

You can use a list comprehension to check whether the lowercased 'id' value is a substring of any entry in the list:

import pandas as pd
str_list = ['abc_123', 'xyz_456']
df = pd.DataFrame({'id':['ABC','XYZ'], 'value':['hi','hey']})

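# min() picks the alphabetically first match; default=None avoids a ValueError when nothing matches
df['match'] = df['id'].apply(lambda x: min((y for y in str_list if x.lower() in y), default=None))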
df

Out[1]: 
    id value    match
0  ABC    hi  abc_123
1  XYZ   hey  xyz_456
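
The snippet above finds the matching string but does not overwrite 'value' as the question asks. One way to do that, as a minimal sketch, is to build a boolean mask from the same substring test and pass it to df.loc. The extra 'QRS' row is an assumption added here only to show that non-matching rows are left alone.

```python
import pandas as pd

str_list = ['abc_123', 'xyz_456']
# 'QRS' is a hypothetical extra row with no partial match in str_list
df = pd.DataFrame({'id': ['ABC', 'XYZ', 'QRS'], 'value': ['hi', 'hey', 'yo']})

# True where the lowercased id is a substring of any entry in str_list
mask = df['id'].apply(lambda x: any(x.lower() in s for s in str_list))

# Overwrite 'value' only on the rows that matched
df.loc[mask, 'value'] = 'new_val'
print(df)
```

The comparison is case-insensitive only because the id is lowercased and the list entries are assumed to be lowercase already; if the list can contain mixed case, lowercase both sides.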

# Create a new DataFrame from the list

df2 = pd.DataFrame({'text': str_list})

# Compute df['value'] with map, using a dict built from the new DataFrame

df['value'] = df.id.map(dict(zip(df2['text'].str.upper().str.split('_').str[0], df2['text'])))


   id    value
0  ABC  abc_123
1  XYZ  xyz_456

How it works

    # new DataFrame
    df2 = pd.DataFrame({'text': str_list})
    # new column holding the uppercased prefix before the underscore
    df2['new'] = df2['text'].str.upper().str.split('_').str[0]
    # dict mapping each prefix to its full string
    d = dict(zip(df2['new'], df2['text']))
    # map the dict onto the initial DataFrame
    df['value'] = df['id'].map(d)
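
One thing to watch with map: any 'id' that has no key in the dict comes back as NaN, so its original 'value' is lost. A possible variation, sketched here under the assumption that unmatched rows should keep their old value, is to fill the gaps from the original column. The 'QRS' row is hypothetical and only there to show the fallback.

```python
import pandas as pd

str_list = ['abc_123', 'xyz_456']
# 'QRS' is a hypothetical row whose id has no prefix match
df = pd.DataFrame({'id': ['ABC', 'XYZ', 'QRS'], 'value': ['hi', 'hey', 'yo']})

# Map each uppercased prefix (text before the first underscore) to its full string
prefix_to_text = {s.split('_')[0].upper(): s for s in str_list}

# map() yields NaN for ids missing from the dict
matched = df['id'].map(prefix_to_text)

# Keep the original value wherever no match was found
df['value'] = matched.fillna(df['value'])
print(df)
```

This relies on every entry in str_list having the form prefix_suffix, as in the question's sample data.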
