I am indexing a corpus of documents (news articles, forum posts, etc.) into Elasticsearch. To provide better search, I have also trained an SVM + TF-IDF classifier that assigns each document tags from a taxonomy, e.g. News-Politics, News-Sports, Post-US Politics. My question: how do I weight the scores generated by the classifier when writing the document into ES?

I have been using a hackish approach: for example, if I get a score of 0.7 for News-Sports, I write ["News-Sports"] * int(score*10), i.e. News-Sports is written as 7 terms into the tags field of the document.
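For concreteness, the repetition hack described above can be sketched as follows. The function name `expand_tags` and the `scale` parameter are made up for illustration; only the `[tag] * int(score * 10)` expression comes from the approach itself.

```python
def expand_tags(scores, scale=10):
    """Repeat each tag int(score * scale) times (the term-repetition hack)."""
    terms = []
    for tag, score in scores.items():
        # int() truncates, so a score of 0.79 contributes the same 7 copies as 0.7
        terms.extend([tag] * int(score * scale))
    return terms


# Hypothetical classifier output for one document
tags_field = expand_tags({"News-Sports": 0.7, "News-Politics": 0.2})
print(tags_field)
```

Note that this quantizes the score to ten levels and inflates the field length, which is part of why it feels hackish.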

Are there better ways of doing index-time weighting?

1 Answer

I'm not sure I entirely understand your question. I read it as: how do you attach a weight to each generated tag so that it affects relevancy?

If that's the case, you could make use of field_value_factor. You could write both the tag and its weight into the document and then use a function_score query to boost by these values.

https://www.elastic.co/guide/en/elasticsearch/guide/master/boosting-by-popularity.html
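As a rough sketch of what that could look like, the snippet below builds the document body and a function_score query as plain Python dicts. The field names `body`, `tag` and `tag_weight` and the helper names are assumptions for illustration, and it assumes a single top tag per document with `tag` mapped as a keyword field; documents carrying several weighted tags would need a different mapping, which is not covered here.

```python
import json


def build_doc(text, tag, weight):
    """Document storing the tag next to its classifier score (hypothetical fields)."""
    return {"body": text, "tag": tag, "tag_weight": weight}


def build_query(user_query, tag):
    """function_score query that multiplies the text score by the stored weight."""
    return {
        "query": {
            "function_score": {
                "query": {
                    "bool": {
                        "must": {"match": {"body": user_query}},
                        # assumes `tag` is a keyword (not analyzed) field
                        "filter": {"term": {"tag": tag}},
                    }
                },
                "field_value_factor": {
                    "field": "tag_weight",
                    "modifier": "none",
                    "missing": 1,  # neutral factor for documents without a weight
                },
                "boost_mode": "multiply",
            }
        }
    }


doc = build_doc("Late goal decides the final", "News-Sports", 0.7)
query = build_query("final", "News-Sports")
print(json.dumps(query, indent=2))
```

The dicts can be passed to whatever client you use for indexing and searching. Storing the raw score keeps its full precision and lets you change the boosting formula at query time without reindexing.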

1 Comment

yep, exactly what i needed. Thanks so much!
