3

In Elasticsearch, is it possible to randomize the order of search results with equal scores without breaking pagination?

I'm hosting a database with thousands of job candidates. When a company searches for a particular skill (or a combination of skills), the results always come back in the same order, so the candidates at the top of the search results have a huge advantage.

Example for a search query:

let params = {
      index: 'candidates',
      type: 'candidate',
      explain: true,
      size: size,
      from: from,
      body: {
        _source: {
          includes: ['firstName', 'middleName', 'lastName']
        },
        query: {
          bool: {
            must: [/* Left out */],
            should: [/* Left out */],
          }
        }
      }
    };
  • You could use rescoring to randomize top-K results. Commented Feb 22, 2020 at 18:39
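
For reference, the rescoring idea from this comment might look roughly like the sketch below. `buildRescoreBody` is a hypothetical helper name, and `window_size: 50` and both weights are placeholder values, not tuned recommendations. Rescoring only reorders the top `window_size` hits on each shard, so pages beyond that window would keep their original order.

```javascript
// Hypothetical helper: wraps an existing query with a rescore phase that
// adds a small seeded random value to the score of the top hits.
function buildRescoreBody(query, seed) {
  return {
    query: query,
    rescore: {
      window_size: 50, // placeholder: number of top hits per shard to rescore
      query: {
        rescore_query: {
          // No inner query given, so the function score is just the random value.
          function_score: {
            random_score: { seed: seed, field: '_seq_no' }
          }
        },
        query_weight: 1,             // keep the original score unchanged
        rescore_query_weight: 0.0001 // placeholder: tiny random contribution
      }
    }
  };
}

const body = buildRescoreBody({ bool: { must: [], should: [] } }, 42);
console.log(JSON.stringify(body, null, 2));
```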

2 Answers

1

Henry's answer is good, but I think it is easier to do:

        function_score: {
          query: {
            ...
          },
          functions: [
            {
              random_score: {
                seed: 12345678910,
                field: '_seq_no'
              },
              // weight is a sibling of random_score, not one of its parameters
              weight: 0.0001
            }
          ],
          boost_mode: 'sum'
        }

So there is no need to boost the original score, just weight the random score down so that it contributes little (but still enough to break ties).

I dislike this approach to breaking ties, though: even if the random part contributes only a little to the score, it can still change the relative order of results whose scores are not equal but very close. This is why I opened this feature request.
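
To make that concern concrete, here is a small arithmetic sketch. It assumes `boost_mode: 'sum'` and that `random_score` yields a value in [0, 1), so the final score is the query score plus `weight` times that value. The function names and the sample scores are made up for illustration.

```javascript
// Final score under boost_mode 'sum': query score + weight * r, with r in [0, 1).
function finalScore(queryScore, randomValue, weight) {
  return queryScore + weight * randomValue;
}

// Two hits can trade places whenever their score gap is smaller than the weight,
// because the random parts can differ by almost the full weight.
function canSwap(scoreA, scoreB, weight) {
  return Math.abs(scoreA - scoreB) < weight;
}

const weight = 0.0001;
console.log(canSwap(2.5, 2.5, weight));     // exact tie: order is decided randomly
console.log(canSwap(2.50005, 2.5, weight)); // gap smaller than weight: may still swap
console.log(canSwap(2.6, 2.5, weight));     // gap larger than weight: order is stable
```

Lowering the weight shrinks the range of affected scores but never removes it entirely.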

0

You could use a function_score query: wrap your bool query in it and add a random_score function. The next step is to find the weighting that matches your needs using "boost" and "boost_mode", or "weight".

Note that if your query only uses filters, the query score will be 0, so you will need to change "boost_mode" from the default "multiply" to "replace", "sum" or something else; otherwise the random score is multiplied by 0.
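
As a rough illustration of why "multiply" breaks down with a query score of 0, the sketch below mimics how the documented boost_mode values combine the query score with the function score. `combine` is a hypothetical helper and 0.42 is an arbitrary stand-in for a random_score value.

```javascript
// Combines the query score with the function score, mirroring the
// boost_mode options of function_score.
function combine(queryScore, functionScore, boostMode) {
  switch (boostMode) {
    case 'multiply': return queryScore * functionScore;
    case 'sum': return queryScore + functionScore;
    case 'replace': return functionScore;
    case 'avg': return (queryScore + functionScore) / 2;
    case 'max': return Math.max(queryScore, functionScore);
    case 'min': return Math.min(queryScore, functionScore);
    default: throw new Error('unknown boost_mode: ' + boostMode);
  }
}

const filterOnlyScore = 0; // a filter-only query contributes no score
const randomValue = 0.42;  // arbitrary stand-in for a random_score value

console.log(combine(filterOnlyScore, randomValue, 'multiply')); // 0: randomness is lost
console.log(combine(filterOnlyScore, randomValue, 'sum'));      // 0.42
console.log(combine(filterOnlyScore, randomValue, 'replace'));  // 0.42
```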

Finally, don't forget to add a seed (and a field, as of ES 7.0) to the random_score to keep pagination near-consistent.

From your example, I would suggest something like:

let params = {
      ...
      body: {
        ...
        // function_score is itself a query, so it goes under `query`
        query: {
          function_score: {
            query: {
              bool: {
                must: [/* Left out */],
                should: [/* Left out */],
                boost: 100
              }
            },
            random_score: {
              seed: 12345678910,
              field: '_seq_no'
            },
            boost_mode: 'sum'
          }
        }
      }
    };
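
One possible way to keep paging consistent while still giving different searches a different order is to derive the seed from something stable per search session. The sketch below assumes a session id string is available; `seedFromSession` and `buildPageParams` are hypothetical helpers, the hash is a simple non-cryptographic one, and the weight of 0.0001 is a placeholder. It uses the `functions` array form of function_score so that `weight` can sit next to `random_score`.

```javascript
// Hypothetical helper: derive a stable non-negative integer seed from a
// session id using a simple 32-bit string hash (not cryptographic).
function seedFromSession(sessionId) {
  let hash = 0;
  for (let i = 0; i < sessionId.length; i++) {
    hash = (hash * 31 + sessionId.charCodeAt(i)) >>> 0;
  }
  return hash;
}

// Hypothetical helper: build search params for one page. The same session
// always gets the same seed, so `from`/`size` page through one fixed order.
function buildPageParams(boolQuery, sessionId, from, size) {
  return {
    index: 'candidates',
    size: size,
    from: from,
    body: {
      query: {
        function_score: {
          query: { bool: boolQuery },
          functions: [
            {
              random_score: { seed: seedFromSession(sessionId), field: '_seq_no' },
              weight: 0.0001 // placeholder: small enough to mostly break ties only
            }
          ],
          boost_mode: 'sum'
        }
      }
    }
  };
}

const page1 = buildPageParams({ must: [], should: [] }, 'session-abc', 0, 10);
const page2 = buildPageParams({ must: [], should: [] }, 'session-abc', 10, 10);
console.log(page1.body.query.function_score.functions[0].random_score.seed ===
            page2.body.query.function_score.functions[0].random_score.seed);
```

Note that the order can still shift between pages if documents are indexed or updated while the user is paging.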
