I've got a loop over sorted() that stops sorting once I add the 'not in stop' test to the key. Basically, the ordering I had before is lost now that I include the NLTK stopwords. Can anyone point out what I did wrong?
I have now included everything in the code for better reference.
EDITED:
import nltk
from nltk.corpus import stopwords
import string
stop = stopwords.words('english') + list(string.punctuation)
f = open('review_text_all.txt', encoding="utf-8")
raw = f.read().lower().replace("'", "").replace("\\", "").replace(",", "").replace("\ufeff", "")
tokens = nltk.word_tokenize(raw)
bgs = nltk.bigrams(tokens)
fdist = nltk.FreqDist(bgs)
for (k,v) in sorted(fdist.items(), key=lambda x: (x[1] not in stop), reverse=True):
    print(k,v)
Here is my result with 'not in stop':
('or', 'irish') 3
('put', 'one') 1
('was', 'repealed') 1
('please', '?') 6
('contact', 'your') 2
('wear', 'sweats') 1
And without 'not in stop':
('white', 'people') 4362
('.', 'i') 3734
('in', 'the') 2880
('of', 'the') 2634
('to', 'be') 2217
('all', 'white') 1778
As you can see, the sort works, but only once I remove the 'not in stop'.
And what is your desired sorted output? Please include a minimal example.
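One likely reason the order is lost: in the key, x[1] is the bigram's count (an integer), so `x[1] not in stop` is True for every item. Every item then has the same key, and because sorted() is stable the items come back in their original order. Below is a minimal sketch of one way around it, assuming the goal is to drop any bigram that contains a stop word and then sort what is left by count. The small `stop` set and `tokens` list are hypothetical stand-ins for the real stop list and file, and `Counter` with `zip` stands in for `nltk.FreqDist` and `nltk.bigrams` so the example runs without NLTK data.

```python
from collections import Counter

# Hypothetical stand-ins for the real stop list and tokenized review text.
stop = {"in", "the", "of", "to", "be", "all", "."}
tokens = "all white people in the town of white people wear sweats".split()

# Count adjacent token pairs, as nltk.bigrams + nltk.FreqDist would.
fdist = Counter(zip(tokens, tokens[1:]))

# The original key tests the count against the stop list: always True.
original_keys = {(count not in stop) for _, count in fdist.items()}

# Filter on the words of each bigram, then sort on the count alone.
kept = [(bg, n) for bg, n in fdist.items() if not any(w in stop for w in bg)]
ranked = sorted(kept, key=lambda item: item[1], reverse=True)

for bigram, count in ranked:
    print(bigram, count)
```

If a bigram should only be dropped when both of its words are stop words, swapping `any` for `all` in the filter would be the corresponding change.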