1

I have some documents in elastic search with completion suggester. I search for some value like Stack, the results are shown in the order given below:

  1. Stack Overflow
  2. Stack-Overflow
  3. Stack
  4. StackOver
  5. StackOverflow

I want the result to be displayed in the order:

  1. Stack
  2. StackOver
  3. StackOverflow
  4. Stack Overflow
  5. Stack-Overflow

i.e, the exacts matches should come first instead of results which space or special characters. TIA

1 Answer 1

1

It all depends on the way you are analysing the string you are querying upon. I will suggest that you apply more than one analyser on the same string field. Below is an example of the mapping of the "name" field over which you want auto complete/suggester feature:

"name": {
    "type": "string",
    "analyzer": "keyword_analyzer",
    "fields": {
        "name_ac": {
            "type": "string",
            "index_analyzer": "string_autocomplete_analyzer",
            "search_analyzer": "keyword_analyzer"
        }
    }
}

Here, keyword_analyzer and string_autocomplete_analyzer are analysers defined in your index settings. Below is an example:

"keyword_analyzer": {
    "type": "custom",
    "filter": [
        "lowercase"
    ],
    "tokenizer": "keyword"
}

"string_autocomplete_analyzer": {
    "type": "custom",
    "filter": [
        "lowercase"
        ,
        "autocomplete"
    ],
    "tokenizer": "whitespace"
}

Here autocomplete is an analysis filter:

"autocomplete": {
    "type": "edgeNGram",
    "min_gram": "1",
    "max_gram": "10"
}

After having set this, when searching in Elasticsearch for the auto suggestions, you can make use of multiMatch queries instead of the normal match queries and here you provide boosts to individual fields in the multiMatch. Below is a example in java:

QueryBuilders.multiMatchQuery(yourSearchString,"name^3","name_ac");

You may need to alter the boost (^3) as per your needs.

If even this does not satisfy your requirements, you can look at having one more analyser which analyse the string based on first word and include that field in the multiMatch. Below is an example of such an analyser:

"first_word_name_analyzer": {
    "type": "custom",
    "filter": [
        "lowercase"
        ,
        "whitespace_merge"
        ,
        "edgengram"
    ],
    "tokenizer": "keyword"
}

With these analysis filters:

"whitespace_merge": {
    "pattern": "\s+",
    "type": "pattern_replace",
    "replacement": " "
},
"edgengram": {
    "type": "edgeNGram",
    "min_gram": "1",
    "max_gram": "32"
}

You may have to do some trials on the boost values in order to reach the most optimum results based on your requirements. Hope this helps.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.