I'm trying to iterate through a dataframe column to extract a certain set of words. I'm mapping these as key-value pairs in a dictionary and, with some help, have managed to set one key per row so far.
Now I would like to return multiple keys in the same row if their values are present in the string, separated by a | (pipe).
Code:
import pandas as pd
import numpy as np

df = pd.DataFrame({'Name': ['Red and Blue Lace Midi Dress', 'Long Armed Sweater Azure and Ruby',
                            'High Top Ruby Sneakers', 'Tight Indigo Jeans',
                            'T-Shirt Navy and Rose']})

colour = {'red': ('red', 'rose', 'ruby'), 'blue': ('azure', 'indigo', 'navy')}

def fetchColours(x):
    for key, values in colour.items():
        for value in values:
            if value in x.lower():
                return key
    else:
        return np.nan

df['Colour'] = df['Name'].apply(fetchColours)
Output:
                                Name  Colour
0       Red and Blue Lace Midi Dress     red
1  Long Armed Sweater Azure and Ruby    blue
2             High Top Ruby Sneakers     red
3                 Tight Indigo Jeans    blue
4              T-Shirt Navy and Rose    blue
Expected result:
                                Name    Colour
0       Red and Blue Lace Midi Dress       red
1  Long Armed Sweater Azure and Ruby  blue|red
2             High Top Ruby Sneakers       red
3                 Tight Indigo Jeans      blue
4              T-Shirt Navy and Rose  blue|red
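
For illustration, here is a minimal sketch of one way the expected result might be produced. Instead of returning on the first hit, it collects every key that has at least one matching value and then joins the keys with a pipe. The function name `fetch_all_colours` is hypothetical, and sorting the keys alphabetically is an assumption made so that the output reads `blue|red` as in the expected result.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Name': ['Red and Blue Lace Midi Dress', 'Long Armed Sweater Azure and Ruby',
                            'High Top Ruby Sneakers', 'Tight Indigo Jeans',
                            'T-Shirt Navy and Rose']})

colour = {'red': ('red', 'rose', 'ruby'), 'blue': ('azure', 'indigo', 'navy')}

def fetch_all_colours(name):
    """Hypothetical helper: return all matching keys joined by '|', or NaN if none match."""
    lowered = name.lower()
    # Keep a key as soon as any of its values occurs as a substring of the name.
    matches = sorted(key for key, values in colour.items()
                     if any(value in lowered for value in values))
    # Assumption: keys are ordered alphabetically so the result is deterministic.
    return '|'.join(matches) if matches else np.nan

df['Colour'] = df['Name'].apply(fetch_all_colours)
print(df)
```

Note that this is plain substring matching, just like the original function, so a value such as 'red' would also match inside a longer word like 'bored'. If that matters, matching on whole words (for example by splitting the name first) may be a better fit.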