Autocomplete functionality using elastic search

Question

I have an Elasticsearch index with documents following the mapping below, and I want to add autocomplete functionality over the specified fields:

mapping: https://gist.github.com/anonymous/0609b1d110d91dceb9a90faa76d1d5d4

Use case:

My queries are typed prefixes, e.g. "sta", "star", "star w", ... "star war", with an additional filter such as tags = "science fiction". The queries could also match other fields such as description or actors (in the cast field; note that this is nested). I also want to know which field the query matched.

I investigated two ways of doing this, but neither seems to address the use case above:

1) Suggester autocomplete:

https://www.elastic.co/guide/en/elasticsearch/reference/1.7/search-suggesters-completion.html

With this, it seems I have to add another field called "suggest" that replicates the data, which is not desirable.

2) Using a prefix filter/query:

https://www.elastic.co/guide/en/elasticsearch/reference/1.7/query-dsl-prefix-filter.html

This returns the whole document, not the exact matching terms.

Is there a clean way of achieving this? Please advise.

vinod_vh · Accepted Answer · 2016-11-01 03:38:12Z

1

Don't create the mapping separately; insert the data directly into the index, and Elasticsearch will create a default mapping for it. Use the query below for autocomplete.

GET /netflix/movie/_search
{
  "query": {
    "query_string": {
      "query": "sta*"
    }
  }
}

answered Nov 1, 2016 at 3:38

vinod_vh


4 Comments

user3784881 Over a year ago

Thanks for the reply, but this will return the whole document. What about returning just the matching terms for that document, since tags will be a long list?

vinod_vh Over a year ago

Can you share your mapping?

vinod_vh Over a year ago

When the user searches, do you want to show only the search_terms data?

user3784881 Over a year ago

I want both the search terms and the fields they matched on. Thanks again.

ChintanShah25 · Accepted Answer · 2016-11-01 20:50:05Z

1

I think the completion suggester would be the cleanest way, but if that is undesirable you could use aggregations on the name field.

This is a sample index (I am assuming you are using ES 1.7, based on your question):

PUT netflix
{
  "settings": {
    "analysis": {
      "analyzer": {
        "prefix_analyzer": {
          "tokenizer": "keyword",
          "filter": [
            "lowercase",
            "trim",
            "edge_filter"
          ]
        },
        "keyword_analyzer": {
          "tokenizer": "keyword",
          "filter": [
            "lowercase",
            "trim"
          ]
        }
      },
      "filter": {
        "edge_filter": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 20
        }
      }
    }
  },
  "mappings": {
    "movie": {
      "properties": {
        "name": {
          "type": "string",
          "fields": {
            "prefix": {
              "type": "string",
              "index_analyzer": "prefix_analyzer",
              "search_analyzer": "keyword_analyzer"
            },
            "raw": {
              "type": "string",
              "analyzer": "keyword_analyzer"
            }
          }
        },
        "tags": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}

Using multi-fields, the name field is analyzed in different ways. name.prefix uses the keyword tokenizer with an edge n-gram filter at index time, so that the string star wars is broken into s, st, sta, etc. At search time, keyword_analyzer is used so that the search query is not broken into multiple small tokens. name.raw is used for the aggregation.
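To make the index-time versus search-time difference concrete, here is a rough Python simulation of the two analyzers. It is only a sketch: the function names are made up for illustration, and it approximates what the keyword tokenizer plus the lowercase, trim and edge n-gram filters produce rather than calling Elasticsearch.

```python
def edge_ngrams(token, min_gram=1, max_gram=20):
    """Return the leading substrings of token, from min_gram up to max_gram characters."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def prefix_analyzer(text):
    """Approximate index-time analysis of name.prefix."""
    # The keyword tokenizer keeps the whole value as a single token,
    # which is then lowercased, trimmed and expanded into edge n-grams.
    token = text.lower().strip()
    return edge_ngrams(token)


def keyword_analyzer(text):
    """Approximate search-time analysis: one lowercased, trimmed token."""
    return [text.lower().strip()]


def matches(query, indexed_name):
    """A match query hits when the single query token equals an indexed n-gram."""
    return keyword_analyzer(query)[0] in prefix_analyzer(indexed_name)


print(prefix_analyzer("Star Wars")[:6])  # ['s', 'st', 'sta', 'star', 'star ', 'star w']
print(matches("Star W", "Star Wars"))    # True
print(matches("wars", "Star Wars"))      # False: only leading prefixes are indexed
```

Because the whole name is one token, only prefixes of the full name match; a query for a word in the middle of the name will not.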

The following query returns the top 10 suggestions.

GET netflix/movie/_search
{
  "query": {
    "filtered": {
      "filter": {
        "term": {
          "tags": "sci-fi"
        }
      },
      "query": {
        "match": {
          "name.prefix": "sta"
        }
      }
    }
  },
  "size": 0,
  "aggs": {
    "unique_movie_name": {
      "terms": {
        "field": "name.raw",
        "size": 10
      }
    }
  }
}

The results will look something like this:

"aggregations": {
      "unique_movie_name": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 0,
         "buckets": [
            {
               "key": "star trek",
               "doc_count": 1
            },
            {
               "key": "star wars",
               "doc_count": 1
            }
         ]
      }
   }
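On the client side, the suggestions are just the bucket keys. A minimal Python sketch, assuming the response has already been parsed into a dict (the helper name and the sample dict are illustrative, mirroring the output above):

```python
def suggestions_from_aggs(response, agg_name="unique_movie_name"):
    """Collect the bucket keys of a terms aggregation as a list of suggestions."""
    buckets = response.get("aggregations", {}).get(agg_name, {}).get("buckets", [])
    return [bucket["key"] for bucket in buckets]


# Sample response shaped like the aggregation output shown above.
sample_response = {
    "aggregations": {
        "unique_movie_name": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {"key": "star trek", "doc_count": 1},
                {"key": "star wars", "doc_count": 1},
            ],
        }
    }
}

print(suggestions_from_aggs(sample_response))  # ['star trek', 'star wars']
```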

UPDATE:

I think you could use highlighting for this purpose. The highlight section gives you the whole word and the field it matched. You can also use inner hits with highlighting inside them to get the nested docs as well.

{
  "query": {
    "query_string": {
      "query": "sta*"
    }
  },
  "_source": false,
  "highlight": {
    "fields": {
      "*": {}
    }
  }
}
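To turn the highlight section into (field, term) pairs for an autocomplete list, something like the following Python sketch could work. It assumes the default highlighter tags (<em> and </em>); the helper name, the sample response and the field names in it are hypothetical, since the real mapping is only in the linked gist.

```python
import re

# Default highlighter wraps matched terms in <em>...</em>.
EM_TAG = re.compile(r"<em>(.*?)</em>")


def matched_terms(response):
    """Return (field, term) pairs found in the highlight section of each hit."""
    results = []
    for hit in response.get("hits", {}).get("hits", []):
        for field, fragments in hit.get("highlight", {}).items():
            for fragment in fragments:
                for term in EM_TAG.findall(fragment):
                    results.append((field, term))
    return results


# Hypothetical response; the field names are assumptions for illustration only.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "1", "highlight": {"name": ["<em>Star</em> Wars"]}},
            {"_id": "2", "highlight": {"description": ["A <em>starship</em> crew explores space"]}},
        ]
    }
}

print(matched_terms(sample_response))  # [('name', 'Star'), ('description', 'starship')]
```

Hits without a highlight section are simply skipped, so the function can be applied to any search response.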

edited Nov 1, 2016 at 20:50

answered Nov 1, 2016 at 17:46

ChintanShah25


5 Comments

user3784881 Over a year ago

Thanks a lot for your solution. Here is the complete mapping: gist.github.com/anonymous/0609b1d110d91dceb9a90faa76d1d5d4. Is there a way to return which fields it matched on (there are nested fields too) as part of the result, using your solution or any other way? Thanks a lot once again!

ChintanShah25 Over a year ago

What is the requirement? Would you like to suggest results (autocomplete), or do you want to know which fields the user's query matched? I thought you wanted to autocomplete the movie name field.

user3784881 Over a year ago

I would like to suggest results (autocomplete) and also know which field each suggestion came from. The suggestion could come from the movie name as well as from an actor's name, the description, etc., so that in the auto-suggest each suggestion is displayed with what kind of entity it is. E.g. the query "ar" results in 1) "arnold schwarzenegger" as an actor entity (it matched the actor name) and 2) "arabian nights" as a movie (it matched the movie name). I really appreciate your help with this.

ChintanShah25 Over a year ago

I have updated the answer. Let me know if you need any further help.

user3784881 Over a year ago

Thanks a lot for your help. Highlights should work; I will try it out!

suresh chaudhari · Accepted Answer · 2023-04-06 12:44:21Z

0

You can use a lowercase filter for the Elasticsearch index. This lets searches match uppercase letters as well.

Create the index using the settings below:

PUT lowercase_example
{
  "settings": {
    "analysis": {
      "analyzer": {
        "whitespace_lowercase": {
          "tokenizer": "whitespace",
          "filter": [ "lowercase" ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "field1": {
        "type": "text",
        "analyzer": "whitespace_lowercase"
      }
    }
  }
}

Now a search will match the field regardless of whether the text is lowercase or uppercase.
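As a rough illustration of why case stops mattering, the following Python sketch approximates a whitespace tokenizer followed by a lowercase filter. The function name is made up, and this only mimics the analysis; it does not call Elasticsearch.

```python
def whitespace_lowercase(text):
    """Split on whitespace, then lowercase each token."""
    return [token.lower() for token in text.split()]


print(whitespace_lowercase("Star WARS"))  # ['star', 'wars']

# Indexed text and query text end up as the same tokens, whatever their case.
print(whitespace_lowercase("star wars") == whitespace_lowercase("STAR Wars"))  # True
```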

answered Apr 6, 2023 at 12:44

suresh chaudhari


papierkorp · Accepted Answer · 2025-01-28 13:36:31Z

I created this table for myself:

UseCase	Completion S.	Context S.	Term S.	Phrase S.	search_as_you_type	Edge N-Gram
Basic Auto-Complete	X	X			X	X
Flexible Search/Query					X	X
High Performance for Large Datasets	X	X	X	X
Higher Memory Usage	X	X				X
Higher Storage Usage					X	X
Substring Matches					X	X
Dynamic Data Updates			X	X	X	X
Relevance Scoring			X	X	X	X
Spell Correction			X	X
Complexity to implement	low	high	medium	high	low	medium
Speciality	fast prefix matching	context-aware suggestions	single term corrections	multi term corrections	implements edge n-gram, full text partial matching

Differentiate between query suggestion and search.

Since the question was asked, the search_as_you_type field type has been implemented, which is exactly what the author would have needed back then :D
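For reference, a minimal sketch of what a search_as_you_type setup could look like, written as Python dicts that can be sent as request bodies. It assumes a recent Elasticsearch version with typeless mappings and a field called name; the helper names are illustrative, and the bool_prefix multi_match over the generated _2gram and _3gram subfields follows the pattern in the official documentation.

```python
import json


def build_mapping(field="name"):
    """Index body that maps one field as search_as_you_type."""
    return {"mappings": {"properties": {field: {"type": "search_as_you_type"}}}}


def build_query(text, field="name", size=10):
    """Search body that matches the typed text as a prefix of the field."""
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": text,
                "type": "bool_prefix",
                # The field type creates these shingle subfields automatically.
                "fields": [field, f"{field}._2gram", f"{field}._3gram"],
            }
        },
    }


print(json.dumps(build_mapping(), indent=2))
print(json.dumps(build_query("star w"), indent=2))
```

A tags filter like the one in the question could be combined with this by wrapping the multi_match in a bool query with a filter clause.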
