Word frequency per pandas dataframe row

Question

I'm trying to figure out how to get the most frequent words per dataframe row, let's say the top 10 most frequent words. I have code that gets me the most frequent words for the whole dataframe, but now I need to be more granular.

import pandas as pd
import numpy as np
df1 = pd.read_csv('C:/temp/comments.csv',encoding='latin-1',names=['client','comments'])
df1.head(3)

Now I can get the most frequent words on the whole df1:

y = pd.Series(' '.join(df1['comments']).lower().split()).value_counts()[:10]

How can I get that information for each row of the dataframe?

jpp · Accepted Answer · 2018-04-21 23:03:29Z


There are several different ways you can do this, depending on whether you want a dataframe, series of dictionaries, or list of dictionaries.

from collections import Counter

import pandas as pd

# dataframe of word counts per row (one column per distinct word, NaN where a word is absent)
res = df1['comments'].str.split().apply(lambda words: pd.Series(words).value_counts())

# series of dictionaries of word counts, each series entry covering one row
res = df1['comments'].str.split().apply(Counter)

# list of dictionaries of word counts, each list item covering one row
res = [Counter(x) for x in df1['comments'].str.split()]
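
The question asks specifically for the top 10 words per row, which the Counter-based approach can give through `Counter.most_common`. The following is a minimal sketch, not part of the original answer: the sample data, the `top_words` helper, and the `top_words` column name are all made up for illustration, and it assumes words are separated by whitespace and should be compared case-insensitively (as in the question's whole-dataframe code).

```python
from collections import Counter

import pandas as pd

# Hypothetical sample data standing in for comments.csv
df1 = pd.DataFrame({
    'client': ['a', 'b'],
    'comments': ['Good good service', 'slow slow slow reply'],
})


def top_words(text, n=10):
    """Return up to n (word, count) pairs for the most frequent words in text."""
    # Missing values (NaN/None) are not strings; treat them as having no words
    if not isinstance(text, str):
        return []
    return Counter(text.lower().split()).most_common(n)


# One list of (word, count) pairs per row
df1['top_words'] = df1['comments'].apply(top_words)
```

Each entry of `df1['top_words']` is a list sorted from most to least frequent, so a row with fewer than 10 distinct words simply yields a shorter list.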
