In R, I can easily build a data frame containing the frequencies of certain string patterns in a list of strings.

library(stringr)
library(dplyr)
text = c('i am so hhappy happy now','you look ssad','sad day today','noway')
# Count occurrences of each pattern in every string (one column per pattern)
dat = sapply(c('happy', 'sad'), function(i) str_count(text, i))
dat = data.frame(dat)
dat = dat %>% mutate(Sentiment = happy - sad)

This gives me a data frame like this:

  happy sad Sentiment
1     2   0         2
2     0   1        -1
3     0   1        -1
4     0   0         0

In Python, I can work out the rest of the code, except for the equivalent of sapply():

import pandas as pd
text = ['i am so hhappy happy now','you look ssad','sad day today','noway']
????
dat = pd.DataFrame(dat)
dat['Sentiment'] = dat.apply(lambda c: c.happy - c.sad, axis=1)

What would ???? be?

1 Answer

You could use pd.Series.str.count:

import pandas as pd

text = ['i am so hhappy happy now','you look ssad','sad day today','noway']
df = pd.DataFrame({'text' : text})

df['happy'] = df.text.str.count('happy')
df['sad'] = df.text.str.count('sad')
df['Sentiment'] = df.happy - df.sad

df
                       text  happy  sad  Sentiment
0  i am so hhappy happy now      2    0          2
1             you look ssad      0    1         -1
2             sad day today      0    1         -1
3                     noway      0    0          0
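
To mirror the sapply() call from the question more closely, one option is to build one column per pattern with a dict comprehension. This is only a sketch of what could stand in for the ???? placeholder: the name patterns is introduced here for illustration and is not in the original code. Note that Series.str.count treats its argument as a regular expression, so patterns containing special characters would probably need re.escape.

```python
import pandas as pd

text = ['i am so hhappy happy now', 'you look ssad', 'sad day today', 'noway']
patterns = ['happy', 'sad']  # hypothetical name for the list of patterns

# One column per pattern, analogous to sapply(c('happy', 'sad'), ...) in R.
s = pd.Series(text)
dat = pd.DataFrame({p: s.str.count(p) for p in patterns})

# Plain column arithmetic is vectorized, so apply() is not needed here.
dat['Sentiment'] = dat['happy'] - dat['sad']
print(dat)
```

Adding another pattern then only means extending the list, and the text column is left out of the result, as in the R version.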
4 Comments

And, just for more detail, you can construct the df above from your text list with df = pd.DataFrame([[sentence] for sentence in text], columns=['text']).
@Paul There's a simpler way. ;-)
Ahh, indeed there is! I probably should have thought of that. Thanks for adding it.
It is helpful!! Thank you so much!
