3

I have a pandas dataframe:

import pandas as pd
test = pd.DataFrame({'words':[['foo','bar none','scare','bar','foo'],
                              ['race','bar none','scare'],
                              ['ten','scare','crow bird']]})

I'm trying to get a word/phrase count of all the list elements in the dataframe column. My current solution is:

from collections import Counter

allwords = []
for index, row in test.iterrows():
    for word in row['words']:
        allwords.append(word)

pd.Series(Counter(allwords)).sort_values(ascending=False)

This works, but I was wondering if there was a faster solution. Note: I'm not using ' '.join() because I don't want the phrases to be split into individual words.
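To make the note about ' '.join() concrete, here is a small sketch of what goes wrong when the lists are joined on spaces and split again. The variable names (words, joined_counts, flat_counts) are only for illustration and are not part of the original code.

```python
from collections import Counter

words = [['foo', 'bar none', 'scare', 'bar', 'foo'],
         ['race', 'bar none', 'scare'],
         ['ten', 'scare', 'crow bird']]

# Joining on spaces and splitting again loses the phrase boundaries.
joined_counts = Counter(' '.join(' '.join(row) for row in words).split())

# Flattening the lists keeps each element intact.
flat_counts = Counter(word for row in words for word in row)

print(joined_counts['bar'])          # 3: 'bar' itself plus one 'bar' from each 'bar none'
print(flat_counts['bar'])            # 1
print('bar none' in joined_counts)   # False
```

The phrase 'bar none' disappears from the joined counts and inflates the count for 'bar', which is why the lists need to be flattened rather than joined.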

3 Answers

5

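Let's try np.hstack with Series.value_counts (the top-level pd.value_counts is deprecated in recent pandas versions):

import numpy as np
pd.Series(np.hstack(test['words'])).value_counts()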

scare        3
foo          2
bar none     2
ten          1
bar          1
crow bird    1
race         1
dtype: int64

3

Try using Counter:

import collections
words = test['words'].tolist()

collections.Counter([x for sublist in words for x in sublist])

Counter({'foo': 2,
         'bar none': 2,
         'scare': 3,
         'bar': 1,
         'race': 1,
         'ten': 1,
         'crow bird': 1})

2

To improve performance, don't use iterrows:

from collections import Counter
from itertools import chain

a = pd.Series(Counter(chain.from_iterable(test['words']))).sort_values(ascending=False)
print(a)
scare        3
foo          2
bar none     2
bar          1
race         1
ten          1
crow bird    1
dtype: int64
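If you want to check the performance claim on your own machine, a rough timing sketch along these lines may help. The enlarged frame (big), the helper names, and the repeat counts are assumptions made for illustration; actual timings depend on your data and pandas version, so no numbers are quoted here.

```python
import timeit
from collections import Counter
from itertools import chain

import pandas as pd

# Hypothetical larger frame: the question's three rows repeated many times.
rows = [['foo', 'bar none', 'scare', 'bar', 'foo'],
        ['race', 'bar none', 'scare'],
        ['ten', 'scare', 'crow bird']]
big = pd.DataFrame({'words': rows * 2000})


def with_iterrows(df):
    """Row-by-row loop, as in the question."""
    allwords = []
    for _, row in df.iterrows():
        allwords.extend(row['words'])
    return Counter(allwords)


def with_chain(df):
    """Flatten the column lazily and count in one pass."""
    return Counter(chain.from_iterable(df['words']))


# Both approaches must agree before their timings are worth comparing.
assert with_iterrows(big) == with_chain(big)

t_iterrows = timeit.timeit(lambda: with_iterrows(big), number=3)
t_chain = timeit.timeit(lambda: with_chain(big), number=3)
print(f"iterrows: {t_iterrows:.4f}s  chain: {t_chain:.4f}s")
```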

Pandas-only solution:

a = pd.Series([y for x in test['words'] for y in x]).value_counts()
print(a)
scare        3
bar none     2
foo          2
bar          1
race         1
crow bird    1
ten          1
dtype: int64
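Another pandas-only option, not part of the original answers, is Series.explode, which is available from pandas 0.25 onward. A minimal sketch, assuming the same test frame as in the question:

```python
import pandas as pd

test = pd.DataFrame({'words': [['foo', 'bar none', 'scare', 'bar', 'foo'],
                               ['race', 'bar none', 'scare'],
                               ['ten', 'scare', 'crow bird']]})

# explode() gives each list element its own row; value_counts() then tallies them.
counts = test['words'].explode().value_counts()
print(counts)
```

The order of entries with equal counts may differ between pandas versions, so compare the counts themselves rather than the row order.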
