I have a list of decimal values that I'm adding to a pandas DataFrame. I want to divide the values into 3 ranges, with each range mapped to a single label.

sents = []
for sent in sentis:
    if sent > 0:  # values <= 0 are skipped entirely
        if sent < 0.40:
            sents.append('negative')
        elif sent <= 0.60:  # 0.40 <= sent <= 0.60
            sents.append('neutral')
        else:  # sent > 0.60
            sents.append('positive')

My question is whether there is a more efficient way to do this in pandas, as I'm trying to apply it to a much bigger list.

Thanks in advance.

2 Answers

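You can use pd.cut to bin the values. The result is of categorical type and carries the appropriate labels.

pd.cut uses right-closed bins by default, so an edge of exactly .4 would put 0.4 in the negative bin. To make the neutral category include both .4 and .6, I subtract machine epsilon from the lower edge and add it to the upper edge.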

import numpy as np
import pandas as pd

sentis = np.linspace(0, 1, 11)
eps = np.finfo(float).eps  # machine epsilon for float64

pd.DataFrame(dict(
        Value=sentis,
        Sentiment=pd.cut(
            sentis, [-np.inf, .4 - eps, .6 + eps, np.inf],
            labels=['negative', 'neutral', 'positive']
        ),
    ))

   Sentiment  Value
0   negative    0.0
1   negative    0.1
2   negative    0.2
3   negative    0.3
4    neutral    0.4
5    neutral    0.5
6    neutral    0.6
7   positive    0.7
8   positive    0.8
9   positive    0.9
10  positive    1.0
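
To see what the epsilon adjustment changes, here is a minimal sketch comparing plain edges with nudged ones. The sample `values` array and the variable names `plain` and `nudged` are made up for illustration; only the default right-closed behavior of pd.cut is relied on.

```python
import numpy as np
import pandas as pd

labels = ['negative', 'neutral', 'positive']
values = np.array([0.39, 0.4, 0.5, 0.6, 0.61])  # illustrative sample data

# Default bins are right-closed: (-inf, 0.4], (0.4, 0.6], (0.6, inf].
# A value of exactly 0.4 therefore falls into 'negative'.
plain = pd.cut(values, [-np.inf, 0.4, 0.6, np.inf], labels=labels)

# Nudging the edges by machine epsilon moves 0.4 into the middle bin
# and keeps values a hair above 0.6 (float rounding) in it as well.
eps = np.finfo(float).eps
nudged = pd.cut(values, [-np.inf, 0.4 - eps, 0.6 + eps, np.inf], labels=labels)

print(list(plain))
print(list(nudged))
```

The only difference between the two results should be the label assigned to 0.4.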

1 Comment

Thanks a lot, this is exactly what I'm looking for.

List comprehension:

['negative' if x < 0.4 else 'positive' if x > 0.6 else 'neutral' for x in sentis]
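
For a larger list, the same three-way rule can probably be applied without a Python-level loop. The following is a sketch using np.select, which is a swapped-in technique and not part of the answer above; the sample `sentis` values and the column names `Value` and `Sentiment` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

sentis = [0.1, 0.4, 0.5, 0.6, 0.9]  # illustrative sample data
df = pd.DataFrame({'Value': sentis})

values = df['Value'].to_numpy()

# Conditions are checked in order; anything matching neither
# (that is, 0.4 <= value <= 0.6) gets the default label.
df['Sentiment'] = np.select(
    [values < 0.4, values > 0.6],
    ['negative', 'positive'],
    default='neutral',
)

print(df)
```

Unlike the pd.cut approach, this yields plain strings and not a categorical column, so convert with astype('category') if that matters.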
