I'm trying to figure out how to get the most frequent words per DataFrame row, say the top 10. I have code that gets the most frequent words for the whole DataFrame, but now I need to be more granular.

import pandas as pd
import numpy as np
df1 = pd.read_csv('C:/temp/comments.csv',encoding='latin-1',names=['client','comments'])
df1.head(3)

Now I can get the most frequent words on the whole df1:

y = pd.Series(' '.join(df1['comments']).lower().split()).value_counts()[:10]

How can I get that information per row of the DataFrame?

1 Answer

There are several ways to do this, depending on whether you want a DataFrame, a Series of dictionaries, or a list of dictionaries.

from collections import Counter

# DataFrame of word counts: one row per comment, one column per distinct word
# (pd.value_counts is deprecated in recent pandas; Series.value_counts replaces it)
res = df1['comments'].str.split().apply(lambda words: pd.Series(words).value_counts())

# Series of Counter objects (dict subclasses), each entry covering one row
res = df1['comments'].str.split().apply(Counter)

# list of Counter objects, each list item covering one row
res = [Counter(x) for x in df1['comments'].str.split()]
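
The snippets above return every word count per row, while the question asks for only the top 10. A minimal sketch of one way to get that, using Counter.most_common: the sample data, the helper name top_words, and the new column name are assumptions made for illustration, not part of the original post. Lowercasing mirrors the whole-DataFrame code in the question, and missing comments are mapped to an empty list.

```python
from collections import Counter

import pandas as pd

# Hypothetical stand-in for the CSV from the question
df1 = pd.DataFrame({
    "client": ["a", "b", "c"],
    "comments": ["Good service good price", "slow slow slow delivery", None],
})

def top_words(text, n=10):
    """Return up to n (word, count) pairs for the most frequent words in text."""
    if not isinstance(text, str):  # missing comment (None/NaN): nothing to count
        return []
    return Counter(text.lower().split()).most_common(n)

# One list of (word, count) pairs per row
df1["top_words"] = df1["comments"].apply(top_words)
```

Each entry of df1["top_words"] is a list ordered from most to least frequent, so the most frequent word of the first row is df1["top_words"][0][0].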