How to write a composite Elasticsearch query including query_string for multiple fields?

Question

My object looks like this

{
  "country_id": 3,
  "user_id": 22,
  "name": "John",
  "surname": "Wright",
  "city_name": "Sydney"
}

I want to do this:

SELECT * FROM STUDENT WHERE user_id = :1 AND country_id = :2
AND (LOWER(name) LIKE '%' || :3 || '%'
OR LOWER(surname) LIKE '%' || :4 || '%'
OR LOWER(city_name) LIKE '%' || :5 || '%')
OFFSET :6 ROWS FETCH NEXT :7 ROWS ONLY

I tried the following:

curl -XPOST "http://xxx.xxx.xxx.x:9200/xxxx/students/_search" -d '{
  "from": 6, "size": 11,
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "country_id": "123"
          },
          {
          "term": {
            "user_id": "abc35"
          }
        },
        {
          "query_string": {
            "query": "name:*abc*",
            "query": "surname:*abc*",
            "query": "city_name:*abc*",
          }
        }
      ]
    }
  }
}

The user's search string should be applied to the fields name, surname, city_name, and so on.

Can someone please point out what I'm missing? I want the smallest possible query, because the user's search string may need to be applied to many more fields (such as school_name, hobbies, education).

It is also worth noting that running leading-wildcard queries is not efficient and might hurt the performance of your cluster, depending on the volume of data you have. You should check whether the wildcard field type would be a better fit for your use case.
– Val, Commented Mar 1, 2021 at 5:02
@noobie I added a wildcard-style query and some links to other approaches. Let Elasticsearch shine and try to avoid LIKE '%...%' style queries. Cheers
– llermaly, Commented Mar 1, 2021 at 8:33

llermaly · Accepted Answer · 2021-03-01 08:32:03Z

Ingest Data

POST test_noobie/_doc
{
  "country_id": 3,
  "user_id": 22,
  "name": "John",
  "surname": "Wright",
  "city_name": "Sydney"
}

Query

POST test_noobie/_search
{
  "query": {
    "bool": {
      "must": {
        "query_string": {
          "query": "*ydn*",
          "fields": [
            "city_name",
            "name",
            "surname"
          ]
        }
      },
      "filter": [
        {
          "term": {
            "user_id": 22
          }
        },
        {
          "term": {
            "country_id": 3
          }
        }
      ]
    }
  }
}

Note that I put the ID-related conditions inside the filter clause. This is more efficient: we don't care about scoring for exact matches, so scoring is omitted for them.

Response

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "test_noobie",
        "_type" : "_doc",
        "_id" : "Tsck7HcB50NMsuQPC1TV",
        "_score" : 1.0,
        "_source" : {
          "country_id" : 3,
          "user_id" : 22,
          "name" : "John",
          "surname" : "Wright",
          "city_name" : "Sydney"
        }
      }
    ]
  }
}
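
Since the list of searchable fields may grow (school_name, hobbies, education), the request body above can be generated instead of written by hand. Below is a minimal Python sketch of that idea, not a tested client integration. The function names `build_search_body` and `escape_query_string` are hypothetical, and the `from`/`size` pagination is an addition that mirrors the OFFSET / FETCH NEXT part of the SQL in the question. The escaping step assumes that user input should be treated as literal text rather than as query_string syntax.

```python
import re

# Characters that query_string treats as operators (assumption: user input
# should be matched literally, so each one gets a backslash in front).
_RESERVED = re.compile(r'([+\-=&|!(){}\[\]^"~*?:\\/])')


def escape_query_string(text):
    """Escape query_string operators; "<" and ">" cannot be escaped, so drop them."""
    text = text.replace("<", "").replace(">", "")
    return _RESERVED.sub(r"\\\1", text)


def build_search_body(text, fields, filters, offset=0, limit=10):
    """Build a bool query: wildcard query_string in must, exact terms in filter."""
    return {
        "from": offset,   # rows to skip, like OFFSET in the SQL
        "size": limit,    # rows to return, like FETCH NEXT
        "query": {
            "bool": {
                "must": {
                    "query_string": {
                        # One wildcard pattern applied to every listed field.
                        "query": "*" + escape_query_string(text) + "*",
                        "fields": list(fields),
                    }
                },
                # Exact matches go in filter, so they are not scored.
                "filter": [{"term": {name: value}} for name, value in filters.items()],
            }
        },
    }


body = build_search_body(
    "ydn",
    ["city_name", "name", "surname"],
    {"user_id": 22, "country_id": 3},
    offset=0,
    limit=10,
)
```

Adding another searchable field is then just one more entry in the `fields` list. The resulting dict can be serialized with `json.dumps` and sent as the body of the `_search` request.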

Wildcards are expensive, by the way, and using them is not optimal. You can read about other strategies instead; here are some starting points:

Fuzzy Queries
Suggesters (for 'did you mean' functionalities)
Stemming

One of the big advantages of moving to a full-text search engine is its built-in functionality: use it as much as possible before resorting to wildcards, regex, or Painless scripts.
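
As an illustration of the fuzzy option listed above, here is a hedged Python sketch that swaps the wildcard `query_string` for a `multi_match` query with `"fuzziness": "AUTO"`. The function name `build_fuzzy_body` is hypothetical. Note that this is a different technique from LIKE '%...%': fuzzy matching tolerates small typos in whole terms (for example, "Sydny" should still find "Sydney"), but it does not match arbitrary substrings such as "ydn".

```python
def build_fuzzy_body(text, fields, filters):
    """Build a bool query that uses typo-tolerant matching instead of wildcards."""
    return {
        "query": {
            "bool": {
                "must": {
                    "multi_match": {
                        "query": text,
                        "fields": list(fields),
                        # AUTO picks the allowed edit distance from the term length.
                        "fuzziness": "AUTO",
                    }
                },
                # Same non-scoring exact filters as in the wildcard version.
                "filter": [{"term": {name: value}} for name, value in filters.items()],
            }
        }
    }


fuzzy_body = build_fuzzy_body(
    "Sydny",
    ["city_name", "name", "surname"],
    {"user_id": 22, "country_id": 3},
)
```

Whether this fits depends on what users actually type: if they enter whole words that may contain typos, fuzzy matching is usually a better fit than wildcards; if they really need substring search, the wildcard field type mentioned in the comments may be worth evaluating.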
