
Is there a way to get at the individual probabilities using nltk.NaiveBayesClassifier.classify? I want to see the probabilities of classification to try and make a confidence scale. Obviously with a binary classifier the decision is going to be one or the other, but is there some way to see the inner workings of how the decision was made? Or, do I just have to write my own classifier?

Thanks

2 Comments

  • What have you tried? Have you tried working with most_informative_features, show_most_informative_features, etc.? Commented Dec 25, 2013 at 13:19
  • Yes, of course. I am looking for a way to get the individual probabilities of classification after training, when I pass in a new document and it returns a decision. The classifier I have trained is working fine; I am wondering if there is a way to observe the decision probabilities when classifying a document with the already-trained classifier. Commented Dec 25, 2013 at 13:23

2 Answers


How about nltk.NaiveBayesClassifier.prob_classify?

http://nltk.org/api/nltk.classify.html#nltk.classify.naivebayes.NaiveBayesClassifier.prob_classify

classify calls this function:

def classify(self, featureset):
    return self.prob_classify(featureset).max()

Edit: something like this should work (not tested):

dist = classifier.prob_classify(features)
for label in dist.samples():
    print("%s: %f" % (label, dist.prob(label)))
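For a self-contained illustration, here is a minimal sketch (the feature names and training data are invented for this example) that trains a tiny classifier and prints the probability of each label:

```python
import nltk

# Toy training data: hand-built feature dicts, invented for illustration
train = [
    ({'contains(good)': True,  'contains(bad)': False}, 'pos'),
    ({'contains(good)': True,  'contains(bad)': False}, 'pos'),
    ({'contains(good)': False, 'contains(bad)': True},  'neg'),
    ({'contains(good)': False, 'contains(bad)': True},  'neg'),
]
classifier = nltk.NaiveBayesClassifier.train(train)

features = {'contains(good)': True, 'contains(bad)': False}
dist = classifier.prob_classify(features)  # a probability distribution over labels
for label in dist.samples():
    print("%s: %f" % (label, dist.prob(label)))

# classify() is just the most probable label from this distribution
print(classifier.classify(features) == dist.max())  # True
```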

4 Comments

prob_classify is for training with unknown features. I have a trained classifier already. When I pass the classifier a new document, it classifies it. But a Naive Bayes classifier makes decisions based on probabilities and I am wondering if you can easily access those decision probabilities?
See my edit -- classify just returns the most probable label according to prob_classify. Where did you find that "prob_classify is for training with unknown features"? Btw. our discussion seems identical to groups.google.com/forum/#!topic/nltk-users/rZhvtVMhMXA
From the description in the link you posted. It explains that it is for classifying unlabeled documents when you only have one labelled class. For example, if you have positive-sentiment documents labelled for your training set but no negative documents, it will classify documents as pos or other, which in this case would be negative.
No, the documentation I posted refers to NaiveBayesClassifier. What you are talking about is further down the page and concerns the positivenaivebayes module.

I know this is utterly old, but as I struggled for some time to figure this out, I am sharing this code.

It shows the probability associated with each feature in the Naive Bayes classifier. It helped me better understand how show_most_informative_features works. That method is probably the best option for most people (and quite possibly why it was created). Anyway, for those like me who must see the individual probability for each label and word, you can use this code:

for label in classifier.labels():
    print(f'\n\n{label}:')
    for fname, fval in classifier.most_informative_features(50):
        prob = classifier._feature_probdist[label, fname].prob(fval)
        print(f"   {fname}({fval}): {100 * prob:.2f}%")
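The loop above assumes a trained classifier is already in scope. A self-contained version (with invented toy data, and noting that _feature_probdist is an internal, underscore-prefixed attribute that NLTK does not guarantee to keep stable) might look like:

```python
import nltk

# Invented toy training data for illustration
train = [
    ({'contains(good)': True},  'pos'),
    ({'contains(good)': True},  'pos'),
    ({'contains(good)': False}, 'neg'),
    ({'contains(good)': False}, 'neg'),
]
classifier = nltk.NaiveBayesClassifier.train(train)

for label in classifier.labels():
    print(f'\n{label}:')
    for fname, fval in classifier.most_informative_features(5):
        # _feature_probdist maps (label, fname) to an estimated
        # distribution over feature values: P(fname = fval | label)
        p = classifier._feature_probdist[label, fname].prob(fval)
        print(f"   {fname}({fval}): {100 * p:.2f}%")
```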
