My object looks like this:

{
      "country_id":3,
      "user_id":22,
      "name": "John",
      "surname": "Wright",
      "city_name":"Sydney"
}

I want to do this:

SELECT * FROM STUDENT WHERE user_id= :1 AND country_id= :2 
AND (LOWER(name) LIKE '%' || :3 || '%' 
OR LOWER(surname) LIKE '%' || :4 || '%' 
OR LOWER(city_name) LIKE '%' || :5 || '%') 
OFFSET :6 ROWS FETCH NEXT :7 ROWS ONLY

I tried the following:

curl -XPOST "http://xxx.xxx.xxx.x:9200/xxxx/students/_search" -d '{
  "from": 6, "size": 11,
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "country_id": "123"
          }
        },
        {
          "term": {
            "user_id": "abc35"
          }
        },
        {
          "query_string": {
            "query": "name:*abc*",
            "query": "surname:*abc*",
            "query": "city_name:*abc*",
          }
        }
      ]
    }
  }
}'

The user's search string will be applied to the fields name, surname, city_name, etc.

Can someone please point out what I'm missing? I want the smallest possible query, since the user's search string may need to be applied to several more fields (such as school_name, hobbies, education).

  • It is also worth noting that wildcard queries with a leading wildcard are not efficient and might hurt the performance of your cluster, depending on the volume of data you have. You should check whether the wildcard field type is a better fit for your use case. Commented Mar 1, 2021 at 5:02
  • @noobie I added a wildcard-style query and some links to other approaches. Let Elasticsearch shine and try to avoid LIKE '%...%' style queries. Cheers. Commented Mar 1, 2021 at 8:33

1 Answer

Ingest Data

POST test_noobie/_doc
{
  "country_id": 3,
  "user_id": 22,
  "name": "John",
  "surname": "Wright",
  "city_name": "Sydney"
}

Query

POST test_noobie/_search
{
  "query": {
    "bool": {
      "must": {
        "query_string": {
          "query": "*ydn*",
          "fields": [
            "city_name",
            "name",
            "surname"
          ]
        }
      },
      "filter": [
        {
          "term": {
            "user_id": 22
          }
        },
        {
          "term": {
            "country_id": 3
          }
        }
      ]
    }
  }
}

Note that I put the id-related clauses inside the filter context. This is more efficient: we don't care about scoring for exact matches, so score calculation is omitted for those clauses.
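The query above leaves out the OFFSET / FETCH NEXT part of the SQL statement. In Elasticsearch that is usually expressed with the top-level "from" and "size" keys, which the question's attempt already used. Below is a minimal Python sketch that assembles the same request body for any list of fields. The function name build_search_body and its parameters are hypothetical, not part of any client library, and the sketch assumes the search string contains no query_string reserved characters (those would need escaping first).

```python
import json


def build_search_body(search, fields, user_id, country_id, offset=0, limit=10):
    """Return a _search request body as a plain dict (hypothetical helper)."""
    return {
        "from": offset,   # rows to skip, like OFFSET :6 ROWS
        "size": limit,    # rows to return, like FETCH NEXT :7 ROWS ONLY
        "query": {
            "bool": {
                # Scored clause: one wildcard pattern applied to every field.
                "must": {
                    "query_string": {
                        "query": f"*{search}*",
                        "fields": list(fields),
                    }
                },
                # Exact matches go in filter context, so they are not scored.
                "filter": [
                    {"term": {"user_id": user_id}},
                    {"term": {"country_id": country_id}},
                ],
            }
        },
    }


body = build_search_body(
    "ydn", ["city_name", "name", "surname"], user_id=22, country_id=3
)
print(json.dumps(body, indent=2))
```

Adding another searchable field such as school_name then only means appending it to the fields list; the printed JSON can be sent as the body of POST test_noobie/_search.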

Response

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "test_noobie",
        "_type" : "_doc",
        "_id" : "Tsck7HcB50NMsuQPC1TV",
        "_score" : 1.0,
        "_source" : {
          "country_id" : 3,
          "user_id" : 22,
          "name" : "John",
          "surname" : "Wright",
          "city_name" : "Sydney"
        }
      }
    ]
  }
}

Wildcard queries are expensive, by the way, so they are not the optimal choice. It is worth reading about other strategies instead.

One of the big advantages of moving to a full-text search engine is that you can rely on its built-in functions as much as possible before resorting to wildcards, regexes, or Painless scripts.
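If substring matching is really required, the wildcard field type mentioned in the comments is one option to evaluate. The Python sketch below only builds the index mapping and a matching query as dicts. It assumes a 7.x release recent enough to support the wildcard field type and the case_insensitive option of the wildcard query; the helper names are hypothetical, and whether this performs better on your data would need to be measured.

```python
import json


def wildcard_mapping(fields):
    """Index mapping that stores each searchable field as type 'wildcard'."""
    return {
        "mappings": {
            "properties": {name: {"type": "wildcard"} for name in fields}
        }
    }


def wildcard_any_field_query(search, fields):
    """bool/should query: the pattern must match at least one of the fields."""
    return {
        "bool": {
            "should": [
                {
                    "wildcard": {
                        name: {
                            "value": f"*{search}*",
                            # Assumed to be supported by the cluster version.
                            "case_insensitive": True,
                        }
                    }
                }
                for name in fields
            ],
            "minimum_should_match": 1,
        }
    }


search_fields = ["city_name", "name", "surname"]
print(json.dumps(wildcard_mapping(search_fields), indent=2))
print(json.dumps(wildcard_any_field_query("ydn", search_fields), indent=2))
```

The resulting query can replace the query_string clause under "must", with the user_id and country_id term filters kept as they are.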
