I have created a list of words associated with a certain category. For example:
care = ["safe", "peace", "empathy"]
I also have a dataframe containing speeches, which are about 450 words long on average. I have counted the number of matches for each category using this line of code:
df['Care'] = df['Speech'].apply(lambda x: len([val for val in x.split() if val in care]))
This gives me the total number of matches for each category.
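For reference, a minimal version of my setup looks roughly like this (the speech text here is just made-up example data; my real speeches are around 450 words each):

import pandas as pd

care = ["safe", "peace", "empathy"]

# made-up example speeches for illustration
df = pd.DataFrame({'Speech': ['we want peace and empathy for all',
                              'a safe place with peace of mind']})

# total number of matches per speech for the "care" category
df['Care'] = df['Speech'].apply(lambda x: len([val for val in x.split() if val in care]))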
However, I need to see the frequency of each individual word in the list, not just the total. I tried this code (here Tal is my speech column and auktoritet is another of my word lists):
df.Tal.str.extractall('({})'.format('|'.join(auktoritet)))\
.iloc[:, 0].str.get_dummies().sum(level=0)
I've tried different methods, but the problem is that partial matches are always included. For example, "hammer" would be counted as a match for "ham".
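I suspect the regex needs word boundaries so that only whole words are matched. Something roughly like this, written with the Speech/care names from the first example (an untested sketch; pattern and word_counts are just names I made up):

import re

# only match whole words, and escape each word in case it contains regex characters
pattern = r'\b({})\b'.format('|'.join(map(re.escape, care)))

# count how many times each word from the list occurs in each speech
word_counts = (df['Speech'].str.extractall(pattern)
                 .iloc[:, 0]
                 .str.get_dummies()
                 .groupby(level=0).sum()
                 .reindex(columns=care, fill_value=0))

But I'm not sure this is the right approach, or how it handles case and punctuation.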
Any ideas on how to solve this?