I'm trying to label my training data, a set of documents, with two categories: positive and negative. My approach is to tokenize each document into words, then compare the tokens against a sentiment dictionary to count how many positive and negative words each record contains. The two counts are then compared, and the record is labeled with whichever sentiment is more dominant. I need a glimpse of how to do this in Python.
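The approach described above could be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the tiny `POSITIVE_WORDS` and `NEGATIVE_WORDS` sets are hypothetical stand-ins for a real sentiment lexicon, the regex tokenizer is a simplification, and returning `"neutral"` on a tie is an assumption, since the question only names two categories.

```python
import re

# Hypothetical toy lexicon; in practice these sets would be loaded
# from the chosen sentiment dictionary.
POSITIVE_WORDS = {"good", "great", "love", "happy"}
NEGATIVE_WORDS = {"bad", "poor", "hate", "sad"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def count_sentiment(tokens):
    """Return (positive_count, negative_count) for a list of tokens."""
    pos = sum(1 for t in tokens if t in POSITIVE_WORDS)
    neg = sum(1 for t in tokens if t in NEGATIVE_WORDS)
    return pos, neg


def label_document(text):
    """Label a document by whichever sentiment count is larger."""
    pos, neg = count_sentiment(tokenize(text))
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    # Assumption: ties (including no lexicon hits) are marked neutral.
    return "neutral"


documents = [
    "I love this great phone",
    "Bad and sad service",
    "The phone arrived",
]
labels = [label_document(doc) for doc in documents]
```

Here `labels` holds one label per document, in order. How to treat ties is a design choice: they could be dropped from the training set or assigned to a default class instead.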
-
Have you chosen a sentiment lexicon? That would be a good place to start. There are many good ones available, but bear in mind which languages a lexicon covers compared to your data and which dialect it was meant for; using a US lexicon for UK or Australian data can cause issues. – Patrick, Nov 5, 2021 at 8:29
-
Yes, I've checked that dictionary, but I don't know how to apply it to my problem. – Arkan, Nov 5, 2021 at 9:42
-
If you have a lexicon, then you have a labeled dataset. What you're asking would be better served by looking for a sentiment analysis tutorial and then asking more specific questions if you encounter problems. – Patrick, Nov 5, 2021 at 9:46
-
Yes, I've been doing that and ran into another problem; please see this link: stackoverflow.com/questions/69851442/… – Arkan, Nov 5, 2021 at 10:00