
Is there a way to get at the individual probabilities using nltk.NaiveBayesClassifier.classify? I want to see the probabilities of classification to try and make a confidence scale. Obviously with a binary classifier the decision is going to be one or the other, but is there some way to see the inner workings of how the decision was made? Or, do I just have to write my own classifier?

Thanks

2 Comments

  • What have you tried? Have you tried working with most_informative_features, show_most_informative_features, etc.? Commented Dec 25, 2013 at 13:19
  • Yes, of course. I am looking for a way to get the individual probabilities of classification after training, when I pass in a new document and it returns a decision. The classifier I have trained is working fine; I am wondering if there is a way to observe the decision probabilities when classifying a document with the already-trained classifier. Commented Dec 25, 2013 at 13:23

2 Answers


How about nltk.NaiveBayesClassifier.prob_classify?

http://nltk.org/api/nltk.classify.html#nltk.classify.naivebayes.NaiveBayesClassifier.prob_classify

classify calls this function:

def classify(self, featureset):
    return self.prob_classify(featureset).max()

Edit: something like this should work (not tested):

dist = classifier.prob_classify(features)
for label in dist.samples():
    print("%s: %f" % (label, dist.prob(label)))
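For a self-contained illustration, here is a minimal sketch (the feature names and training data are invented for this example) that trains a tiny classifier and prints the probability of each label:

```python
import nltk

# Toy training data: hand-built feature dicts, invented for illustration
train = [
    ({'contains(good)': True,  'contains(bad)': False}, 'pos'),
    ({'contains(good)': True,  'contains(bad)': False}, 'pos'),
    ({'contains(good)': False, 'contains(bad)': True},  'neg'),
    ({'contains(good)': False, 'contains(bad)': True},  'neg'),
]
classifier = nltk.NaiveBayesClassifier.train(train)

features = {'contains(good)': True, 'contains(bad)': False}
dist = classifier.prob_classify(features)  # a probability distribution over labels
for label in dist.samples():
    print("%s: %f" % (label, dist.prob(label)))

# classify() is just the most probable label from this distribution
print(classifier.classify(features) == dist.max())  # True
```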

4 Comments

prob_classify is for training with unknown features. I have a trained classifier already. When I pass the classifier a new document, it classifies it. But a Naive Bayes classifier makes decisions based on probabilities and I am wondering if you can easily access those decision probabilities?
See my edit -- classify just returns the most probable label according to prob_classify. Where did you find that "prob_classify is for training with unknown features"? Btw. our discussion seems identical to groups.google.com/forum/#!topic/nltk-users/rZhvtVMhMXA
From the description in the link you posted. It explains that it is for classifying unlabeled documents when you only have one labelled class. For example, if you have positive-sentiment documents labelled for your training set but no negative documents, it will classify documents as pos or other, which in this case would be negative.
No, the documentation I posted refers to NaiveBayesClassifier. What you are talking about is further down the page and concerns the positivenaivebayes module.

I know this is utterly old, but as I struggled for some time to figure this out, I am sharing this code.

It shows the probability associated with each feature in the Naive Bayes classifier. It helped me better understand how show_most_informative_features works. That method is probably the best option for most people (and quite possibly why it was created). Anyway, for those like me who must see the individual probability for each label and word, you can use this code:

for label in classifier.labels():
    print(f'\n\n{label}:')
    for fname, fval in classifier.most_informative_features(50):
        prob = classifier._feature_probdist[label, fname].prob(fval)
        print(f"   {fname}({fval}): {100 * prob:.2f}%")
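The loop above assumes a trained classifier is already in scope. A self-contained version (with invented toy data, and noting that _feature_probdist is an internal, underscore-prefixed attribute that NLTK does not guarantee to keep stable) might look like:

```python
import nltk

# Invented toy training data for illustration
train = [
    ({'contains(good)': True},  'pos'),
    ({'contains(good)': True},  'pos'),
    ({'contains(good)': False}, 'neg'),
    ({'contains(good)': False}, 'neg'),
]
classifier = nltk.NaiveBayesClassifier.train(train)

for label in classifier.labels():
    print(f'\n{label}:')
    for fname, fval in classifier.most_informative_features(5):
        # _feature_probdist maps (label, fname) to an estimated
        # distribution over feature values: P(fname = fval | label)
        p = classifier._feature_probdist[label, fname].prob(fval)
        print(f"   {fname}({fval}): {100 * p:.2f}%")
```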
