This working code gives me the 5 most relevant documents for a topic from my corpus.

most_relevant_docs = sorted(bow_corpus, reverse=True, key=lambda doc: abs(dict(doc).get(topic_number, 0.0)))
print most_relevant_docs[:5]

But since the corpus is not human-readable, I want to zip an index to the corpus so that I can recover the corresponding documents.

corpus_ids = range(0,len(corpus))
most_relevant_docs = sorted(zip(corpus_ids, bow_corpus), reverse=True, key=lambda my_id, doc : abs(dict(doc).get(topic_number, 0.0)))
print most_relevant_docs[:5]

How do I have to adapt the lambda function so that each result keeps its id together with the document?

  • Can you mock up some data so we can visualize what you are trying to achieve? Of course, as it stands, we can't run any of your code. Commented Jun 29, 2018 at 9:36
  • The lambda is used only as the sort key; it changes the order of the output, not its structure. Commented Jun 29, 2018 at 9:38

1 Answer

Try this:

sortingFunc = lambda doc: abs(dict(doc).get(topic_number, 0.0))
corpus_ids = range(0,len(corpus))
most_relevant_docs = sorted(zip(corpus_ids, bow_corpus), reverse=True, key=lambda pair: sortingFunc(pair[1]))

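When you zip it, each element becomes an (index, value) pair, so the original sorting key no longer works: it would be handed the pair instead of the document. A key function always receives exactly one argument, which is also why lambda my_id, doc fails. Modify the key so that it sorts by the value (pair[1]) rather than by the whole pair.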
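Since the question has no sample data, here is a small self-contained sketch of the same idea. The corpus contents and the helper name topic_weight are made up for illustration, each document is assumed to be a list of (topic_id, weight) pairs, and it uses Python 3 syntax (print as a function) rather than the Python 2 style above. enumerate is used in place of zip(range(...), ...); it yields the same (index, document) pairs.

```python
# Hypothetical stand-in data: each document is a list of (topic_id, weight) pairs.
bow_corpus = [
    [(0, 0.10), (1, 0.70)],   # id 0: topic 1 weight 0.70
    [(0, 0.90)],              # id 1: topic 1 absent
    [(1, -0.80), (2, 0.20)],  # id 2: topic 1 weight -0.80
    [(2, 0.50)],              # id 3: topic 1 absent
]
topic_number = 1


def topic_weight(doc):
    """Absolute weight of topic_number in doc, or 0.0 if the topic is absent."""
    return abs(dict(doc).get(topic_number, 0.0))


# The key receives one (index, doc) pair and scores only the doc part,
# so the index travels along with each document through the sort.
most_relevant_docs = sorted(
    enumerate(bow_corpus),
    key=lambda pair: topic_weight(pair[1]),
    reverse=True,
)

# Unpack the ids of the two best matches.
top_ids = [doc_id for doc_id, _ in most_relevant_docs[:2]]
print(top_ids)  # [2, 0]
```

Each element of most_relevant_docs is an (id, document) tuple, so most_relevant_docs[:5] gives the five best matches together with the ids needed to look up the original texts.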
