I have a text file with one sentence per line, e.g. "Have you registered your email ID with your Bank Account?"
I want to classify each sentence as interrogative or not. FYI, these are sentences from bank websites. I've seen this answer with this NLTK code block:
import nltk
nltk.download('nps_chat')
posts = nltk.corpus.nps_chat.xml_posts()[:10000]
def dialogue_act_features(post):
    features = {}
    for word in nltk.word_tokenize(post):
        features['contains({})'.format(word.lower())] = True
    return features
featuresets = [(dialogue_act_features(post.text), post.get('class')) for post in posts]
size = int(len(featuresets) * 0.1)
train_set, test_set = featuresets[size:], featuresets[:size]
classifier = nltk.NaiveBayesClassifier.train(train_set)
print(nltk.classify.accuracy(classifier, test_set))
So I did some preprocessing on my text file (stemming words, removing stop words, etc.) to turn each sentence into a bag of words. From the code above, I have a trained classifier. How do I apply it to my text file of sentences (either raw or preprocessed)?
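To make the goal concrete, here is a rough sketch of what I think applying the classifier would look like. It assumes the `classifier` and `dialogue_act_features` from the block above are passed in. The helper names `label_sentences` and `label_file`, and the file name `sentences.txt`, are hypothetical. It also assumes that the `whQuestion` and `ynQuestion` classes of the nps_chat corpus are the ones that should count as interrogative. My understanding is that the features have to be built the same way as in training, so this runs on the raw sentences: stemming or stop-word removal would strip tokens such as "have", "you" or "?" that the classifier presumably relies on.

```python
# Dialogue-act labels treated as "interrogative" (an assumption about how
# the nps_chat classes should map onto question / not question).
QUESTION_LABELS = {"whQuestion", "ynQuestion"}


def label_sentences(sentences, classifier, featurize):
    """Classify each non-empty sentence.

    classifier: any object with a classify(features) method, e.g. the
                trained nltk.NaiveBayesClassifier from above.
    featurize:  the same feature function used for training, e.g.
                dialogue_act_features.
    Returns a list of (sentence, predicted_label, is_question) tuples.
    """
    results = []
    for sentence in sentences:
        sentence = sentence.strip()
        if not sentence:
            continue  # skip blank lines
        label = classifier.classify(featurize(sentence))
        results.append((sentence, label, label in QUESTION_LABELS))
    return results


def label_file(path, classifier, featurize):
    """Read one sentence per line from a text file and classify each."""
    with open(path, encoding="utf-8") as handle:
        return label_sentences(handle, classifier, featurize)


# Intended usage (file name is hypothetical):
# for sentence, label, is_question in label_file(
#         "sentences.txt", classifier, dialogue_act_features):
#     print(is_question, label, sentence)
```

Passing the classifier and the feature function in as arguments keeps the sketch independent of how the model was trained, so the same loop would work if I swap in a different classifier later.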
Update: here is an example of my text file.