Sorted key value lambda not working

Question

I've got a sorted() call that stops sorting once I add the 'not in stop' condition to its key. Basically, the sorting I had before is lost now that I try to exclude the NLTK stopwords. Can anyone point out what I did wrong?

I have now included everything in the code for better reference.

EDITED:

import nltk
from nltk.corpus import stopwords
import string

stop = stopwords.words('english') + list(string.punctuation)
f = open('review_text_all.txt', encoding="utf-8")
raw = f.read().lower().replace("'", "").replace("\\", "").replace(",", "").replace("\ufeff", "")

tokens = nltk.word_tokenize(raw)

bgs = nltk.bigrams(tokens)

fdist = nltk.FreqDist(bgs)
for (k,v) in sorted(fdist.items(), key=lambda x: (x[1] not in stop), reverse=True):
    print(k,v)

Here is my result with 'not in stop':

('or', 'irish') 3
('put', 'one') 1
('was', 'repealed') 1
('please', '?') 6
('contact', 'your') 2
('wear', 'sweats') 1

And here it is without 'not in stop':

('white', 'people') 4362
('.', 'i') 3734
('in', 'the') 2880
('of', 'the') 2634
('to', 'be') 2217
('all', 'white') 1778

As you can see, the sort works, but only once I remove the 'not in stop'.

What is fdist and what is your desired sorted output? Include minimal examples. – Chris_Rands, Commented Sep 26, 2017 at 14:48
Do you want to sort or to filter the list? Because sorting on a boolean criterion will almost certainly not produce what you expect. – Guillaume, Commented Sep 26, 2017 at 14:51
Perhaps you need to first apply the filter function, and then sort. As already written, your function for sorting is incorrect. – Nuchimik, Commented Sep 26, 2017 at 14:57

Cédric Julien · Accepted Answer · 2017-09-26 15:33:07Z

4

The key parameter of the sorted function is a function that tells Python which value (an attribute or value derived from each item of the list) to sort on.

In your case, your function returns True or False, which are not really good values to sort on :)
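
To make that concrete, here is a small sketch using a made-up list in place of fdist.items() and a made-up stop list (both are assumptions for illustration, not the asker's data). Since x[1] is the integer count, the test "x[1] not in stop" should come out True for every item, so all keys tie and sorted(), which is stable, leaves the items in their original order:

```python
# Hypothetical (bigram, count) pairs standing in for fdist.items()
items = [(("or", "irish"), 3), (("please", "?"), 6), (("put", "one"), 1)]
stop = ["or", "?"]  # hypothetical stop list of strings

# x[1] is the integer count, which is never a member of a list of strings,
# so the key evaluates to True for every item.
keys = [x[1] not in stop for x in items]

# Every key ties, and sorted() is stable (also with reverse=True),
# so the input order is returned unchanged.
by_bool = sorted(items, key=lambda x: x[1] not in stop, reverse=True)

# Using the count itself as the key gives highest-to-lowest order.
by_count = sorted(items, key=lambda x: x[1], reverse=True)

print(keys)
print(by_bool)
print(by_count)
```

That would explain why the output with 'not in stop' looks unsorted rather than raising an error.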

EDIT:

From what I understand of what you want to achieve, you need to add a filter step before (or after) the sort that removes from your list the items that are in your "stop words" list.

Something like this :

for (k,v) in sorted(filter(lambda x: (x[1] not in stop), fdist.items()), key=lambda x: x[1], reverse=True):
    print(k,v)
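
One thing to watch in the snippet above: x[1] is the count, not a word, so the filter as written probably keeps every item and only the sort has an effect. If the goal is to drop bigrams that contain a stop word, the test likely needs to look at the words in x[0] instead. Below is a minimal sketch of that idea. It uses a collections.Counter with made-up counts in place of nltk.FreqDist, a made-up stop set, and a helper name (has_stop_token) that is purely illustrative. Dropping a bigram when either word is a stop word is also an assumption about what is wanted:

```python
from collections import Counter

# Hypothetical counts standing in for nltk.FreqDist(bgs)
fdist = Counter({
    ("white", "people"): 5,
    ("in", "the"): 4,
    (".", "i"): 3,
    ("wear", "sweats"): 1,
})
stop = {"in", "the", "i", "."}  # hypothetical stop words and punctuation

def has_stop_token(bigram):
    """Return True if any word of the bigram is in the stop set."""
    return any(token in stop for token in bigram)

# Filter on the words of the bigram first ...
kept = [(bigram, count) for bigram, count in fdist.items()
        if not has_stop_token(bigram)]

# ... then sort what is left by count, highest first.
ranked = sorted(kept, key=lambda item: item[1], reverse=True)

for bigram, count in ranked:
    print(bigram, count)
```

To keep bigrams where only one of the two words is a stop word, any() could be swapped for all(). Using a set for stop rather than a list also makes each membership test cheaper on a large corpus.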

edited Sep 26, 2017 at 15:33

answered Sep 26, 2017 at 14:50


3 Comments

M4cJunk13 Over a year ago

It worked, but not exactly the way I needed it to. It sorted by the keys, but I actually need the values to be sorted from highest to lowest.

Cédric Julien Over a year ago

@M4cJunk13 I updated my answer with the (I think) correct comparison method (based on how frequently the words occur)

M4cJunk13 Over a year ago

Perfect, it worked!!! Thank you so much. I'm still trying to get a better understanding of using lambdas.
